Abstract

In this paper, we consider how the eigenvalues of a matrix $A$ change when
it is perturbed to $\tilde A = D_1^* A D_2$ and how the singular values of a
(nonsquare) matrix $B$ change when it is perturbed to $\tilde B = D_1^* B D_2$,
where $D_1$ and $D_2$ are assumed to be close to unitary matrices of suitable
dimensions. We generalize many well-known perturbation theorems, including
the Hoffman-Wielandt theorem and the Weyl-Lidskii theorem. As applications,
we obtain bounds for perturbations of graded matrices in both singular value
problems and nonnegative definite Hermitian eigenvalue problems.


1  Introduction 

Relative perturbation theory for eigensystems and singular systems has
become a hot topic over the last five years. It was first studied by
Kahan [18] in 1966, later in [1, 6, 8, 9, 29], and most recently in [7, 10, 11, 13,
15,  25], 
1.1  What Is Covered?

This  paper  deals  with  perturbations  of  the  following  kinds: 

•  Eigenvalue problems:

1.  $A$ and $\tilde A = D^* A D$ for the Hermitian case, where $D$ is
nonsingular and close to $I$ or, more generally, to a unitary matrix;

2.  $A$ and $\tilde A = D_1^* A D_2$ for the general diagonalizable case, where
$D_1$ and $D_2$ are nonsingular and close to $I$ or, more generally, to some
unitary matrix;

3.  $H = D^* A D$ and $\tilde H = D^* \tilde A D$ for the graded nonnegative
Hermitian case, where it is assumed that $A$ and $\tilde A$ are nonsingular
and often that $D$ is a highly graded diagonal matrix (this assumption is
not necessary for our theorems below).

•  Singular value problems:

1.  $B$ and $\tilde B = D_1^* B D_2$, where $D_1$ and $D_2$ are nonsingular and
close to $I$ or, more generally, to two unitary matrices;

2.  $G = BD$ and $\tilde G = \tilde B D$ for the graded case, where it is
assumed that $B$ and $\tilde B$ are nonsingular and often that $D$ is a highly
graded diagonal matrix (this assumption is not necessary for our theorems
below).

The above perturbations include component-wise relative perturbations of the
entries in symmetric tridiagonal matrices with zero diagonal [8, 18], in
bidiagonal and biacyclic matrices [1, 7, 8], in graded nonnegative Hermitian
matrices [9, 25], in graded matrices of singular value problems [9, 25], and
more [10].

1.2  Notation 

We adopt this convention: capital letters denote unperturbed matrices and
capital letters with a tilde denote their perturbed counterparts. For example,
$X$ is perturbed to $\tilde X$.

Throughout the paper, capital letters are for matrices, lowercase Latin
letters for column vectors or scalars, and lowercase Greek letters for scalars.
The following is a detailed list of our notation; further notation will be
introduced when it appears for the first time.




$\mathbb{C}^{m\times n}$:  the set of $m \times n$ complex matrices;

$\mathbb{C}^m$:  $\mathbb{C}^{m\times 1}$;

$\mathbb{C}$:  $\mathbb{C}^1$;

$\mathbb{R}^{m\times n}$:  the set of $m \times n$ real matrices;

$\mathbb{R}^m$:  $\mathbb{R}^{m\times 1}$;

$\mathbb{R}$:  $\mathbb{R}^1$;

$\mathbb{U}_n$:  the set of $n \times n$ unitary matrices;

$0_{m,n}$:  the $m \times n$ zero matrix (we may simply write $0$ instead);

$I_n$:  the $n \times n$ identity matrix (we may simply write $I$ instead);

$X^*$:  the complex conjugate transpose of a matrix $X$;

$\lambda(X)$:  the set of the eigenvalues of $X$, counted according to their
algebraic multiplicities;

$\sigma(X)$:  the set of the singular values of $X$, counted according to their
algebraic multiplicities;

$\sigma_{\min}(X)$:  the smallest singular value of $X \in \mathbb{C}^{m\times n}$;

$\sigma_{\max}(X)$:  the largest singular value of $X \in \mathbb{C}^{m\times n}$;

$\|X\|_2$:  the spectral norm of $X$, $\sigma_{\max}(X)$;

$\|X\|_F$:  the Frobenius norm of $X$, $\sqrt{\sum_{i,j} |x_{ij}|^2}$, where
$X = (x_{ij})$;

$\|X\|_p$:  the $p$-Hölder operator norm of $X$, to be defined later;

$|||X|||$:  some unitarily invariant norm of $X$, to be defined later.

1.3  Organization  of  the  Paper 

In §2, we define two kinds of relative distances, which are used heavily in
the rest of this paper. It is proved in Appendices A and B that the relative
distances are (generalized) metrics on the space of nonnegative real numbers
or that of nonpositive real numbers, and that some of them are actually metrics
on $\mathbb{R}$. A brief summary of what we accomplish in this paper, in
comparison with well-known perturbation theorems stated in the absolute-value
metric on $\mathbb{C}$, is given in §3. Full statements of these well-known
theorems are presented in §4. We devote two sections to presenting and
discussing our theorems: §5 handles nonnegative definite cases, singular value
problems and graded cases, while §6 handles the rest of the perturbations
listed in §1.1, and singular value problems again for comparison. In §7, we
give a brief account of established theorems related to our relative
perturbation theorems. We also briefly remark on how our relative perturbation
theorems can be applied to generalized eigenvalue problems and generalized
singular value problems. Finally, the proofs of our theorems are presented in
§§9-12.
2  Relative Distances


2.1  The  p-Relative  Distance 

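Given $\alpha, \beta \in \mathbb{C}$, the p-relative distance between them is
defined as

\[ \mathrm{RelDist}_p(\alpha, \beta) \overset{\mathrm{def}}{=} \frac{|\alpha - \beta|}{\sqrt[p]{|\alpha|^p + |\beta|^p}}, \tag{2.1} \]

where $1 \le p \le \infty$. We define, for convenience, $0/0 = 0$.
$\mathrm{RelDist}_\infty$ was first used by Deift, Demmel, Li, and Tomei [6]
for defining relative gaps.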
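As an illustration of definition (2.1), the following is a minimal Python sketch. The function name `rel_dist_p` and its signature are our own choices for illustration and do not come from the paper; the case $p = \infty$ is handled as the limit $\max\{|\alpha|, |\beta|\}$ of the denominator, and the convention $0/0 = 0$ is applied explicitly.

```python
import math


def rel_dist_p(a: complex, b: complex, p: float = 2.0) -> float:
    """p-relative distance (2.1): |a - b| / (|a|^p + |b|^p)^(1/p), with 0/0 := 0.

    The name and signature are illustrative assumptions, not part of the paper.
    """
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    num = abs(a - b)
    if math.isinf(p):
        # Limit of the p-norm of (|a|, |b|) as p -> infinity.
        den = max(abs(a), abs(b))
    else:
        den = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    # The denominator vanishes only when a = b = 0, where 0/0 is defined as 0.
    return 0.0 if den == 0.0 else num / den
```

For instance, `rel_dist_p(3, 0, 2)` evaluates the distance from a nonzero number to zero, and `rel_dist_p(1, -1, math.inf)` the distance between two numbers of opposite sign and equal magnitude.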

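Proposition 2.1  Let $1 \le p \le \infty$ and $\alpha, \beta \in \mathbb{C}$.

1.  $\mathrm{RelDist}_p(\alpha, \beta) \ge 0$, and the equality sign holds if and
only if $\alpha = \beta$;

2.  $\mathrm{RelDist}_p(\alpha, \beta) = \mathrm{RelDist}_p(\beta, \alpha)$;

3.  $\mathrm{RelDist}_p(\xi\alpha, \xi\beta) = \mathrm{RelDist}_p(\alpha, \beta)$
for all $0 \ne \xi \in \mathbb{C}$;

4.  $\mathrm{RelDist}_p(1/\alpha, 1/\beta) = \mathrm{RelDist}_p(\alpha, \beta)$ for
$\alpha \ne 0$ and $\beta \ne 0$;

5.  $\mathrm{RelDist}_p(\alpha, \beta) \le 2^{1-1/p}$, and the equality sign holds
if and only if $\alpha = -\beta \ne 0$;

6.  $\mathrm{RelDist}_p(\alpha, 0) = 1$ if $\alpha \ne 0$;
$\mathrm{RelDist}_p(\alpha, \beta) > 1$ for $p > 1$ and
$\mathrm{RelDist}_1(\alpha, \beta) = 1$, if $\alpha\beta < 0$; finally,
$\mathrm{RelDist}_p(\alpha, \beta) < 1$ for all $p$ if $\alpha\beta > 0$;

7.  $\mathrm{RelDist}_p(\alpha, \beta)$ increases as $p$ does;

8.  if $\alpha, \alpha_1, \beta, \beta_1 \in \mathbb{R}$ and
$\alpha \le \alpha_1 \le \beta_1 \le \beta$ and $\alpha_1\beta_1 \ge 0$, then

\[ \mathrm{RelDist}_p(\alpha, \beta) \ge \mathrm{RelDist}_p(\alpha_1, \beta_1). \tag{2.2} \]

Moreover, if either $\alpha < \alpha_1$ or $\beta_1 < \beta$ holds, the
inequality (2.2) is strict.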
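Several of the properties above are easy to spot-check numerically. The sketch below samples random complex numbers and tests symmetry, scaling invariance, inversion invariance, the upper bound $2^{1-1/p}$ and monotonicity in $p$ (Properties 2, 3, 4, 5 and 7). The helper names, the sampling ranges and the tolerance are our own assumptions; a passing run is only a sanity check, not a proof.

```python
import math
import random


def rel_dist_p(a, b, p):
    """p-relative distance (2.1) with the convention 0/0 := 0 (illustrative helper)."""
    num = abs(a - b)
    den = max(abs(a), abs(b)) if math.isinf(p) else (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.0 if den == 0.0 else num / den


def check_properties(trials: int = 1000, seed: int = 0, tol: float = 1e-9) -> bool:
    """Spot-check Properties 2, 3, 4, 5 and 7 of Proposition 2.1 on random samples."""
    rng = random.Random(seed)
    ps = [1.0, 1.5, 2.0, 4.0, math.inf]  # increasing, so Property 7 can be tested
    for _ in range(trials):
        a = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        b = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        xi = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        if min(abs(a), abs(b), abs(xi)) < 1e-3:
            continue  # keep away from zero so that 1/a and 1/b are well conditioned
        previous = -1.0
        for p in ps:
            d = rel_dist_p(a, b, p)
            if abs(d - rel_dist_p(b, a, p)) > tol:            # Property 2
                return False
            if abs(d - rel_dist_p(xi * a, xi * b, p)) > tol:  # Property 3
                return False
            if abs(d - rel_dist_p(1 / a, 1 / b, p)) > tol:    # Property 4
                return False
            if d > 2.0 ** (1.0 - 1.0 / p) + tol:              # Property 5
                return False
            if d < previous - tol:                            # Property 7
                return False
            previous = d
    return True
```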

Proof:  Properties 1-6 are trivial. Property 7 holds because
$\sqrt[p]{|\alpha|^p + |\beta|^p}$ is a decreasing function of $p$ for
$1 \le p \le \infty$. To prove Property 8, it suffices to show that

\[ \mathrm{RelDist}_p(\alpha, \beta) \ge \mathrm{RelDist}_p(\alpha, \beta_1), \tag{2.3} \]

where $\alpha \le \beta_1 \le \beta$ and $\alpha\beta_1 \ge 0$. Consider the
function $f(\xi)$ defined by

\[ f(\xi) = \frac{1 - \xi}{\sqrt[p]{1 + |\xi|^p}}, \quad \text{where } -1 \le \xi \le 1. \]

We claim that the function $f(\xi)$ so defined is strictly monotonically
decreasing. This is true if $p = \infty$. When $p < \infty$, set
$h(\xi) \overset{\mathrm{def}}{=} [f(\xi)]^p$. Because for $0 < \xi < 1$

\[ h'(\xi) = -\frac{p(1-\xi)^{p-1}(1 + \xi^{p-1})}{(1 + \xi^p)^2} < 0, \]

$f(\xi)$ is strictly monotonically decreasing for $0 \le \xi \le 1$. For
$-1 \le \xi \le 0$, set $g(\xi) \overset{\mathrm{def}}{=} h(-\xi)$. Since for
$0 < \xi < 1$

\[ g'(\xi) = \frac{p(1+\xi)^{p-1}(1 - \xi^{p-1})}{(1 + \xi^p)^2} > 0, \]

$g(\xi)$ is strictly monotonically increasing for $0 \le \xi \le 1$, and thus
$h(\xi)$ and $f(\xi)$ are strictly monotonically decreasing for
$-1 \le \xi \le 0$. This completes the proof that the function $f(\xi)$ is
strictly monotonically decreasing. There are several cases to deal with in
order to prove (2.3).

1.  If $\alpha \ge 0$, then $0 \le \alpha/\beta \le \alpha/\beta_1 \le 1$ and

\[ \mathrm{RelDist}_p(\alpha, \beta) = f(\alpha/\beta) \ge f(\alpha/\beta_1) = \mathrm{RelDist}_p(\alpha, \beta_1); \]

2.  if $\beta \le 0$, then $0 \le \beta/\alpha \le \beta_1/\alpha \le 1$ and

\[ \mathrm{RelDist}_p(\alpha, \beta) = f(\beta/\alpha) \ge f(\beta_1/\alpha) = \mathrm{RelDist}_p(\alpha, \beta_1); \]

3.  if $\beta_1 \le 0 < \beta$, then $0 \le \beta_1/\alpha \le 1$. Let $\xi_0$
be the one of $\alpha/\beta$ and $\beta/\alpha$ which lies in $[-1, 0]$. Now if
$\alpha = \beta_1 = 0$, (2.3) is trivial; otherwise either
$\alpha \le \beta_1 < 0 < \beta$ or $\alpha < \beta_1 = 0 < \beta$ is true, and
thus $-1 \le \xi_0 < 0 \le \beta_1/\alpha \le 1$, so we have

\[ \mathrm{RelDist}_p(\alpha, \beta) = f(\xi_0) \ge f(\beta_1/\alpha) = \mathrm{RelDist}_p(\alpha, \beta_1), \]

as was to be shown.

The proof of Property 8 is completed.  ■

Remark:  In Property 8 of Proposition 2.1, the assumption
$\alpha_1\beta_1 \ge 0$ is essential. This can be seen by noting that for
$\beta > \alpha > 0$, $-\alpha \le -\alpha < \alpha < \beta$ while

\[ \mathrm{RelDist}_p(-\alpha, \beta) = \frac{\alpha + \beta}{\sqrt[p]{\alpha^p + \beta^p}} < 2^{1-1/p} = \mathrm{RelDist}_p(-\alpha, \alpha). \]

Now we introduce another global notation of this paper. Henceforth $p$ and
$q$ are reserved for a dual number pair as defined below:

\[ \frac{1}{p} + \frac{1}{q} = 1, \quad \text{where } 1 \le p \le \infty \text{ and } 1 \le q \le \infty. \]

In general, when people say that the relative perturbation in a real number
$\alpha$ is at most $\epsilon$, it is meant that $\alpha$ is perturbed to
another real number $\beta$ in the sense that, writing
$\beta = \alpha(1 + \delta)$, we have $\delta \in \mathbb{R}$ and
$|\delta| \le \epsilon$ (see, e.g., [8]), which is equivalent to saying

\[ \left| \frac{\beta}{\alpha} - 1 \right| \le \epsilon. \]

So it would be interesting to relate our p-relative distance to this common
sense of relative perturbations.
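Proposition 2.2  Let $0 \le \epsilon < 1$ and $\alpha, \beta \in \mathbb{R}$.
We have the following:

\[ \left| \frac{\beta}{\alpha} - 1 \right| \le \epsilon \;\Rightarrow\; \mathrm{RelDist}_p(\alpha, \beta) \le \epsilon, \tag{2.4} \]

\[ \mathrm{RelDist}_1(\alpha, \beta) \le \epsilon \;\Rightarrow\; \max\left\{ \left| \frac{\beta}{\alpha} - 1 \right|, \left| \frac{\alpha}{\beta} - 1 \right| \right\} \le \frac{2\epsilon}{1 - \epsilon}, \tag{2.5} \]

\[ \mathrm{RelDist}_2(\alpha, \beta) \le \epsilon \;\Rightarrow\; \max\left\{ \left| \frac{\beta}{\alpha} - 1 \right|, \left| \frac{\alpha}{\beta} - 1 \right| \right\} \le \frac{\sqrt{2}\,\epsilon}{1 - \epsilon}, \tag{2.6} \]

\[ \mathrm{RelDist}_\infty(\alpha, \beta) \le \epsilon \;\Rightarrow\; \max\left\{ \left| \frac{\beta}{\alpha} - 1 \right|, \left| \frac{\alpha}{\beta} - 1 \right| \right\} \le \frac{\epsilon}{1 - \epsilon}. \tag{2.7} \]

For general $1 \le p \le \infty$, if $2^{1/p}\epsilon < 1$ we have

\[ \mathrm{RelDist}_p(\alpha, \beta) \le \epsilon \;\Rightarrow\; \max\left\{ \left| \frac{\beta}{\alpha} - 1 \right|, \left| \frac{\alpha}{\beta} - 1 \right| \right\} \le \frac{2^{1/p}\epsilon}{1 - 2^{1/p}\epsilon}. \tag{2.8} \]

Asymptotically,

\[ \lim_{\beta \to \alpha} \frac{\mathrm{RelDist}_p(\alpha, \beta)}{\left| \frac{\beta}{\alpha} - 1 \right|} = 2^{-1/p}, \tag{2.9} \]

thus (2.4), (2.5), (2.6) and (2.7), (2.8) are at least asymptotically sharp.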

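Proof:  (2.4) is trivial to show, since writing $\beta = \alpha(1 + \delta)$
with $|\delta| \le \epsilon$ gives
$\beta - \alpha = \alpha(1 + \delta) - \alpha = \alpha\delta$. To prove (2.5),
(2.6) and (2.7), we set either $\xi = \beta/\alpha$ or $\xi = \alpha/\beta$.
Then $\xi > 0$. It follows from the left-hand side of (2.5) that

\[ \frac{|\xi - 1|}{\xi + 1} \le \epsilon \;\Rightarrow\; |\xi - 1| \le \epsilon(\xi + 1) = \epsilon(\xi - 1) + 2\epsilon. \]

So if $\xi \ge 1$, one deduces $\xi - 1 \le \frac{2\epsilon}{1 - \epsilon}$, and
if $\xi < 1$ one has $1 - \xi \le \frac{2\epsilon}{1 + \epsilon}$. This
completes the proof of (2.5). The proof of (2.7) is analogous. So is that of
(2.8), by noting that
$2^{1/p}\,\mathrm{RelDist}_p(\alpha, \beta) \ge \mathrm{RelDist}_\infty(\alpha, \beta)$.
To show (2.6), we see that the left-hand side of (2.6) implies
$\frac{|\xi - 1|}{\sqrt{1 + \xi^2}} \overset{\mathrm{def}}{=} \eta \le \epsilon$. So

\[ (\xi - 1)^2 = \eta^2(\xi^2 + 1) \;\Rightarrow\; \xi^2 - \frac{2}{1 - \eta^2}\,\xi + 1 = 0, \]

solving which gives

\[ \xi = \frac{1 \pm \eta\sqrt{2 - \eta^2}}{1 - \eta^2} \;\Rightarrow\; \xi - 1 = \frac{\pm\eta\sqrt{2 - \eta^2} + \eta^2}{1 - \eta^2}. \]

Hence

\[ |\xi - 1| \le \frac{\eta\sqrt{2 - \eta^2} + \eta^2}{1 - \eta^2} = \frac{\eta}{1 - \eta} \cdot \frac{\sqrt{2 - \eta^2} + \eta}{1 + \eta} \le \frac{\epsilon}{1 - \epsilon}\,\sqrt{2}, \]

since $\frac{\sqrt{2 - \eta^2} + \eta}{1 + \eta}$ is decreasing for
$0 \le \eta \le 1$.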
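The implications of Proposition 2.2 can be spot-checked numerically. The sketch below takes random positive pairs, sets $\epsilon$ to the exact value of $\mathrm{RelDist}_p(\alpha, \beta)$, and compares the ordinary relative errors with the right-hand sides of (2.5)-(2.8); it also checks (2.4) with $\epsilon = |\beta/\alpha - 1|$. All function names, sampling ranges and tolerances are assumptions made for illustration. For $p = 1$ and $p = \infty$ the bounds are attained exactly, which is why a small relative tolerance is used.

```python
import math
import random


def rel_dist_p(a, b, p):
    """p-relative distance (2.1) with the convention 0/0 := 0 (illustrative helper)."""
    num = abs(a - b)
    den = max(abs(a), abs(b)) if math.isinf(p) else (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.0 if den == 0.0 else num / den


def implied_bound(eps: float, p: float) -> float:
    """Right-hand sides of (2.5)-(2.8); returns inf when (2.8) is not applicable."""
    if p == 1.0:
        return 2.0 * eps / (1.0 - eps)
    if p == 2.0:
        return math.sqrt(2.0) * eps / (1.0 - eps)
    if math.isinf(p):
        return eps / (1.0 - eps)
    c = 2.0 ** (1.0 / p) * eps
    return c / (1.0 - c) if c < 1.0 else math.inf


def check_prop_2_2(trials: int = 2000, seed: int = 1) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.uniform(0.1, 10.0)
        b = rng.uniform(0.1, 10.0)
        worst = max(abs(b / a - 1.0), abs(a / b - 1.0))
        for p in (1.0, 2.0, 3.0, math.inf):
            eps = rel_dist_p(a, b, p)  # smallest eps with RelDist_p(a, b) <= eps
            if worst > implied_bound(eps, p) * (1.0 + 1e-9) + 1e-12:
                return False           # (2.5)-(2.8) would be violated
            if eps > abs(b / a - 1.0) + 1e-12:
                return False           # (2.4) would be violated
    return True
```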


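Proposition 2.3  Let $\tilde\alpha = \alpha(1 + \delta_1)$ and
$\tilde\beta = \beta(1 + \delta_2)$. If $|\delta_i| \le \epsilon < 1$, then

\[ \frac{\mathrm{RelDist}_p(\tilde\alpha, \tilde\beta)}{1 - \epsilon} + \frac{2^{1/q}\epsilon}{1 - \epsilon} \ge \mathrm{RelDist}_p(\alpha, \beta) \ge \frac{\mathrm{RelDist}_p(\tilde\alpha, \tilde\beta)}{1 + \epsilon} - \frac{2^{1/q}\epsilon}{1 + \epsilon}, \tag{2.10} \]

\[ \frac{\mathrm{RelDist}_p(\alpha, \beta)}{1 - \epsilon} + \frac{2^{1/q}\epsilon}{1 - \epsilon} \ge \mathrm{RelDist}_p(\tilde\alpha, \tilde\beta) \ge \frac{\mathrm{RelDist}_p(\alpha, \beta)}{1 + \epsilon} - \frac{2^{1/q}\epsilon}{1 + \epsilon}. \tag{2.11} \]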

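Proof:  We will only provide a proof of (2.11). Since
$|\alpha|(1 - \epsilon) \le |\tilde\alpha| \le |\alpha|(1 + \epsilon)$ and
$|\beta|(1 - \epsilon) \le |\tilde\beta| \le |\beta|(1 + \epsilon)$,

\[
\begin{aligned}
\mathrm{RelDist}_p(\tilde\alpha, \tilde\beta)
 &= \frac{|\tilde\alpha - \tilde\beta|}{\sqrt[p]{|\tilde\alpha|^p + |\tilde\beta|^p}} \\
 &\ge \frac{|\alpha - \beta| - |\alpha\delta_1 - \beta\delta_2|}{\sqrt[p]{|\alpha|^p + |\beta|^p}\,(1 + \epsilon)} \\
 &\ge \frac{|\alpha - \beta| - \sqrt[p]{|\alpha|^p + |\beta|^p}\,\sqrt[q]{\epsilon^q + \epsilon^q}}{\sqrt[p]{|\alpha|^p + |\beta|^p}\,(1 + \epsilon)} \\
 &= \frac{\mathrm{RelDist}_p(\alpha, \beta)}{1 + \epsilon} - \frac{2^{1/q}\epsilon}{1 + \epsilon},
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{RelDist}_p(\tilde\alpha, \tilde\beta)
 &\le \frac{|\alpha - \beta| + |\alpha\delta_1 - \beta\delta_2|}{\sqrt[p]{|\alpha|^p + |\beta|^p}\,(1 - \epsilon)} \\
 &\le \frac{|\alpha - \beta| + \sqrt[p]{|\alpha|^p + |\beta|^p}\,\sqrt[q]{\epsilon^q + \epsilon^q}}{\sqrt[p]{|\alpha|^p + |\beta|^p}\,(1 - \epsilon)} \\
 &= \frac{\mathrm{RelDist}_p(\alpha, \beta)}{1 - \epsilon} + \frac{2^{1/q}\epsilon}{1 - \epsilon},
\end{aligned}
\]

as were to be shown.

■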
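The two-sided bound established in the proof above, namely that the distance between the perturbed pair lies between $(\mathrm{RelDist}_p(\alpha,\beta) - 2^{1/q}\epsilon)/(1+\epsilon)$ and $(\mathrm{RelDist}_p(\alpha,\beta) + 2^{1/q}\epsilon)/(1-\epsilon)$, can be spot-checked as follows. This is a sketch under our own assumptions about names, sampling ranges and tolerances; it uses $1/q = 1 - 1/p$ from the dual-pair convention.

```python
import math
import random


def rel_dist_p(a, b, p):
    """p-relative distance (2.1) with the convention 0/0 := 0 (illustrative helper)."""
    num = abs(a - b)
    den = max(abs(a), abs(b)) if math.isinf(p) else (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.0 if den == 0.0 else num / den


def check_perturbed_arguments(trials: int = 2000, seed: int = 2, tol: float = 1e-9) -> bool:
    """Check the bound proved for RelDist_p of the perturbed pair on random real data."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.uniform(-10, 10), rng.uniform(-10, 10)
        eps = rng.uniform(0.0, 0.9)
        a_t = a * (1.0 + rng.uniform(-eps, eps))  # perturbed alpha, |delta_1| <= eps
        b_t = b * (1.0 + rng.uniform(-eps, eps))  # perturbed beta,  |delta_2| <= eps
        for p in (1.0, 2.0, 3.0, math.inf):
            c = 2.0 ** (1.0 - 1.0 / p) * eps      # 2^(1/q) * eps, since 1/q = 1 - 1/p
            r = rel_dist_p(a, b, p)
            r_t = rel_dist_p(a_t, b_t, p)
            lower = (r - c) / (1.0 + eps)
            upper = (r + c) / (1.0 - eps)
            if not (lower - tol <= r_t <= upper + tol):
                return False
    return True
```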

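Proposition 2.4 below shows how to bound $\mathrm{RelDist}_p(\alpha^2, \beta^2)$
by $\mathrm{RelDist}_p(\alpha, \beta)$, and vice versa.

Proposition 2.4  Let $\alpha, \beta \in \mathbb{C}$. For $1 \le p \le \infty$,

\[ \mathrm{RelDist}_p(\alpha^2, \beta^2) \le 2\,\mathrm{RelDist}_p(\alpha, \beta). \tag{2.12} \]

If, moreover, $\alpha, \beta \in \mathbb{R}$ and $\alpha\beta \ge 0$, then

\[ \mathrm{RelDist}_p(\alpha, \beta) \le \mathrm{RelDist}_p(\alpha^2, \beta^2). \tag{2.13} \]

Proof:  There is nothing to prove if $\alpha = \beta = 0$. Assume at least one
of the two is not zero.

\[
\begin{aligned}
\mathrm{RelDist}_p(\alpha^2, \beta^2)
 &= \frac{|\alpha^2 - \beta^2|}{(|\alpha|^{2p} + |\beta|^{2p})^{1/p}} \\
 &= \frac{|\alpha + \beta| \times (|\alpha|^p + |\beta|^p)^{1/p}}{(|\alpha|^{2p} + |\beta|^{2p})^{1/p}} \times \frac{|\alpha - \beta|}{(|\alpha|^p + |\beta|^p)^{1/p}} \\
 &\le \frac{2^{1 - 1/(2p)}(|\alpha|^{2p} + |\beta|^{2p})^{1/(2p)} \times 2^{1/(2p)}(|\alpha|^{2p} + |\beta|^{2p})^{1/(2p)}}{(|\alpha|^{2p} + |\beta|^{2p})^{1/p}}\,\mathrm{RelDist}_p(\alpha, \beta) \\
 &= 2\,\mathrm{RelDist}_p(\alpha, \beta),
\end{aligned}
\]

which proves (2.12). To prove (2.13), without loss of generality, we may
assume $\alpha, \beta \ge 0$. Notice that
$\alpha + \beta \ge (\alpha^{2p} + \beta^{2p})^{1/(2p)}$ and
$(\alpha^p + \beta^p)^{1/p} \ge (\alpha^{2p} + \beta^{2p})^{1/(2p)}$. So

\[
\begin{aligned}
\mathrm{RelDist}_p(\alpha, \beta)
 &= \frac{|\alpha^2 - \beta^2|}{(|\alpha|^{2p} + |\beta|^{2p})^{1/p}} \cdot \frac{(|\alpha|^{2p} + |\beta|^{2p})^{1/p}}{(\alpha + \beta)(|\alpha|^p + |\beta|^p)^{1/p}} \\
 &\le \mathrm{RelDist}_p(\alpha^2, \beta^2),
\end{aligned}
\]

as was to be shown.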
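A short numerical sanity check of Proposition 2.4 follows: (2.12) is tested on random complex pairs and (2.13) on random nonnegative real pairs. The helper names, ranges and tolerance are illustrative assumptions only.

```python
import math
import random


def rel_dist_p(a, b, p):
    """p-relative distance (2.1) with the convention 0/0 := 0 (illustrative helper)."""
    num = abs(a - b)
    den = max(abs(a), abs(b)) if math.isinf(p) else (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.0 if den == 0.0 else num / den


def check_prop_2_4(trials: int = 2000, seed: int = 3, tol: float = 1e-9) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        a = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        b = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        x, y = rng.uniform(0, 10), rng.uniform(0, 10)  # real numbers of the same sign
        for p in (1.0, 2.0, 4.0, math.inf):
            # (2.12): squaring at most doubles the p-relative distance.
            if rel_dist_p(a * a, b * b, p) > 2.0 * rel_dist_p(a, b, p) + tol:
                return False
            # (2.13): for same-sign reals, squaring never decreases it.
            if rel_dist_p(x, y, p) > rel_dist_p(x * x, y * y, p) + tol:
                return False
    return True
```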


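Let $\{\alpha_1, \ldots, \alpha_n\}$ and
$\{\tilde\alpha_1, \ldots, \tilde\alpha_n\}$ be two sequences of $n$ real
numbers in ascending (descending) order respectively, i.e.,

\[ \alpha_1 \le \cdots \le \alpha_n, \quad \tilde\alpha_1 \le \cdots \le \tilde\alpha_n \quad (\text{or } \alpha_1 \ge \cdots \ge \alpha_n, \quad \tilde\alpha_1 \ge \cdots \ge \tilde\alpha_n). \tag{2.14} \]

Now we consider some partial solutions to the question: what are the best
one-one pairings between the $\alpha_i$'s and the $\tilde\alpha_j$'s under
certain measures?

Proposition 2.5  If all $\alpha_i$'s and $\tilde\alpha_j$'s are nonnegative,
then

\[ \max_{1 \le i \le n} \mathrm{RelDist}_p(\alpha_i, \tilde\alpha_i) = \min_\tau \max_{1 \le i \le n} \mathrm{RelDist}_p(\alpha_i, \tilde\alpha_{\tau(i)}), \]

where the minimization is taken over all permutations $\tau$ of
$\{1, 2, \ldots, n\}$.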

Proof:  For any permutation $\tau$ of $\{1, 2, \ldots, n\}$, the idea of our
proof is to construct $n + 1$ permutations $\tau_j$ such that

\[ \tau_0 = \tau, \quad \tau_n = \text{identity permutation}, \]

and for $j = 0, 1, 2, \ldots, n - 1$

\[ \max_{1 \le i \le n} \mathrm{RelDist}_p(\alpha_i, \tilde\alpha_{\tau_j(i)}) \ge \max_{1 \le i \le n} \mathrm{RelDist}_p(\alpha_i, \tilde\alpha_{\tau_{j+1}(i)}). \]

The construction of these $\tau_j$'s goes as follows: set $\tau_0 = \tau$. Given
$\tau_j$, if $\tau_j(j+1) = j+1$, set $\tau_{j+1} = \tau_j$; otherwise define

\[
\tau_{j+1}(i) =
\begin{cases}
\tau_j(i), & \text{for } j+1 \ne i \ne \tau_j^{-1}(j+1), \\
j + 1, & \text{if } i = j + 1, \\
\tau_j(j+1), & \text{if } i = \tau_j^{-1}(j+1).
\end{cases}
\]

With Property 8 in Proposition 2.1, it is easy to prove by induction that the
$\tau_j$'s so constructed have the desired properties.  ■
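For small $n$, the statement of Proposition 2.5 can be checked by brute force over all permutations. The sketch below does this for random nonnegative sequences sorted in ascending order; function names, the sequence length and the sampling range are our own illustrative choices, and the exhaustive search is only practical for small $n$.

```python
import itertools
import math
import random


def rel_dist_p(a, b, p):
    """p-relative distance (2.1) with the convention 0/0 := 0 (illustrative helper)."""
    num = abs(a - b)
    den = max(abs(a), abs(b)) if math.isinf(p) else (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.0 if den == 0.0 else num / den


def max_pairing_cost(alpha, alpha_t, perm, p):
    """Largest p-relative distance when alpha[i] is paired with alpha_t[perm[i]]."""
    return max(rel_dist_p(alpha[i], alpha_t[perm[i]], p) for i in range(len(alpha)))


def check_prop_2_5(n: int = 4, trials: int = 200, seed: int = 4) -> bool:
    rng = random.Random(seed)
    identity = tuple(range(n))
    for _ in range(trials):
        alpha = sorted(rng.uniform(0, 10) for _ in range(n))
        alpha_t = sorted(rng.uniform(0, 10) for _ in range(n))
        for p in (1.0, 2.0, math.inf):
            best = min(max_pairing_cost(alpha, alpha_t, perm, p)
                       for perm in itertools.permutations(range(n)))
            # The sorted (identity) pairing should attain the minimum.
            if max_pairing_cost(alpha, alpha_t, identity, p) > best + 1e-12:
                return False
    return True
```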

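Remark.  Proposition 2.5 may fail if not all of the $\alpha_i$'s and
$\tilde\alpha_j$'s are of the same sign. A counterexample is as follows:
$n = 2$ and

\[ \alpha_1 = -2 < \alpha_2 = 1 \quad \text{and} \quad \tilde\alpha_1 = 2 < \tilde\alpha_2 = 4. \]

Another point we want to make is that, given two sequences of $\alpha_i$'s and
$\tilde\alpha_j$'s as above, generally we do not have

\[ \sum_{i=1}^n [\mathrm{RelDist}_2(\alpha_i, \tilde\alpha_i)]^2 = \min_\tau \sum_{i=1}^n [\mathrm{RelDist}_2(\alpha_i, \tilde\alpha_{\tau(i)})]^2. \tag{2.15} \]

(2.15) may even fail when all $\alpha_i, \tilde\alpha_j > 0$. Here is a
counterexample: $n = 2$ and

\[ 0 < \alpha_1 < \tilde\alpha_1 < \alpha_2 = \tilde\alpha_2/2 < \tilde\alpha_2, \]

where $\alpha_1$ is sufficiently close to $0$, and $\tilde\alpha_1$ is
sufficiently close to $\alpha_2$, which is fixed. Since as
$\alpha_1 \to 0^+$ and $\tilde\alpha_1 \to \alpha_2^-$

\[ [\mathrm{RelDist}_2(\alpha_1, \tilde\alpha_2)]^2 + [\mathrm{RelDist}_2(\alpha_2, \tilde\alpha_1)]^2 \to 1, \]

\[ [\mathrm{RelDist}_2(\alpha_1, \tilde\alpha_1)]^2 + [\mathrm{RelDist}_2(\alpha_2, \tilde\alpha_2)]^2 \to 1 + \frac{1}{5}, \]

(2.15) must fail for some
$0 < \alpha_1 < \tilde\alpha_1 < \alpha_2 = \tilde\alpha_2/2 < \tilde\alpha_2$.
But we still have Proposition 2.6 below.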
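The positive counterexample to (2.15) can be made concrete with specific numbers. The values below ($\alpha_1 = 10^{-6}$, $\alpha_2 = 1$, $\tilde\alpha_1 = 1 - 10^{-6}$, $\tilde\alpha_2 = 2$) are our own choice of a point close to the limit described in the text; the variable names are likewise illustrative.

```python
import math


def rel_dist_2(a: float, b: float) -> float:
    """RelDist_2 from (2.1): |a - b| / sqrt(a^2 + b^2), with 0/0 := 0."""
    den = math.hypot(a, b)
    return 0.0 if den == 0.0 else abs(a - b) / den


def sum_of_squares(alpha, alpha_t, perm):
    """Sum of squared RelDist_2 values when alpha[i] is paired with alpha_t[perm[i]]."""
    return sum(rel_dist_2(alpha[i], alpha_t[perm[i]]) ** 2 for i in range(len(alpha)))


# Assumed sample point: alpha_1 near 0, alpha_t_1 just below alpha_2, alpha_2 = alpha_t_2 / 2.
alpha = (1e-6, 1.0)
alpha_t = (1.0 - 1e-6, 2.0)

natural = sum_of_squares(alpha, alpha_t, (0, 1))  # sorted pairing, close to 1 + 1/5
crossed = sum_of_squares(alpha, alpha_t, (1, 0))  # crossed pairing, close to 1

print(f"natural pairing: {natural:.6f}, crossed pairing: {crossed:.6f}")
```

Here the crossed pairing gives the smaller sum, so the sorted pairing does not minimize the sum of squared $\mathrm{RelDist}_2$ values for this data.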

Proposition 2.6  Let the $\alpha_i$'s and $\tilde\alpha_j$'s be as described
above and in ascending order. Assume that both sequences contain exactly $k$
negative numbers and $n - k$ positive numbers, i.e.,

\[ \alpha_1 \le \cdots \le \alpha_k < 0 < \alpha_{k+1} \le \cdots \le \alpha_n \quad \text{and} \quad \tilde\alpha_1 \le \cdots \le \tilde\alpha_k < 0 < \tilde\alpha_{k+1} \le \cdots \le \tilde\alpha_n. \]

Then, given a permutation $\sigma$ of $\{1, 2, \ldots, n\}$, there exists
another permutation $\tau$ of $\{1, 2, \ldots, n\}$ such that

\[ 1 \le \tau(j) \le k \quad \text{for } 1 \le j \le k \]

and

\[ \sum_{i=1}^n [\mathrm{RelDist}_2(\alpha_i, \tilde\alpha_{\sigma(i)})]^2 \ge \sum_{i=1}^n [\mathrm{RelDist}_2(\alpha_i, \tilde\alpha_{\tau(i)})]^2. \]

The proof of this proposition depends heavily on Property 6 of Proposition 2.1.
Let $k$ be a positive integer, and set

\[ \alpha_{n+1} = \cdots = \alpha_{n+k} = \tilde\alpha_{n+1} = \cdots = \tilde\alpha_{n+k} = 0. \]

Appending these 0's to the two previous sequences, we have two larger
sequences, each of which has at least $k$ zeros. The following proposition
says that it is always better to pair zeros with zeros.

Proposition 2.7  Given a permutation $\sigma$ of $\{1, 2, \ldots, n + k\}$,
there is a permutation $\tau$ of $\{1, 2, \ldots, n\}$ such that

\[ \sum_{i=1}^{n+k} [\mathrm{RelDist}_2(\alpha_i, \tilde\alpha_{\sigma(i)})]^2 \ge \sum_{i=1}^{n} [\mathrm{RelDist}_2(\alpha_i, \tilde\alpha_{\tau(i)})]^2. \]

A combination of Propositions 2.6 and 2.7 illustrates two things:

1.  It is always better to pair as many zeros with zeros as possible;

2.  It is always better to pair as many numbers as possible with those of
the same sign.

2.2  Barlow-Demmel-Veselić Relative Distance

We introduce another notion of relative distance, RelDist, which is defined
as follows:

\[ \mathrm{RelDist}(\alpha, \beta) \overset{\mathrm{def}}{=} \frac{|\alpha - \beta|}{\sqrt{|\alpha\beta|}}. \tag{2.16} \]

We treat $0/0 = 0$ and $1/0 = \infty$. We call
$\mathrm{RelDist}(\alpha, \beta)$ the Barlow-Demmel-Veselić relative distance
between $\alpha$ and $\beta$ because it was first used by Barlow and Demmel [1]
and Demmel and Veselić [9] for defining relative gaps between the spectra of
two matrices. Regarding RelDist, we have

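Proposition 2.8  Let $\alpha, \beta \in \mathbb{C}$.

1.  $\mathrm{RelDist}(\alpha, \beta) \ge 0$, and the equality sign holds if and
only if $\alpha = \beta$;

2.  $\mathrm{RelDist}(\alpha, \beta) = \mathrm{RelDist}(\beta, \alpha)$;

3.  $\mathrm{RelDist}(\xi\alpha, \xi\beta) = \mathrm{RelDist}(\alpha, \beta)$ for
all $0 \ne \xi \in \mathbb{C}$;

4.  $\mathrm{RelDist}(1/\alpha, 1/\beta) = \mathrm{RelDist}(\alpha, \beta)$ for
$\alpha \ne 0$ and $\beta \ne 0$;

5.  $\mathrm{RelDist}(\alpha, 0) = \infty$ if $\alpha \ne 0$;

6.  if $\alpha, \alpha_1, \beta, \beta_1 \in \mathbb{R}$ and
$\alpha \le \alpha_1 \le \beta_1 \le \beta$ and $\alpha\beta > 0$, then

\[ \mathrm{RelDist}(\alpha, \beta) \ge \mathrm{RelDist}(\alpha_1, \beta_1). \tag{2.17} \]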
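Definition (2.16) and some of the properties just listed can be illustrated with a few lines of Python. The function name `rel_dist` is our own, and the conventions $0/0 = 0$ and $1/0 = \infty$ are implemented explicitly; the helper `check_invariances` samples Properties 2, 3 and 4 on random complex numbers and is a sanity check under assumed ranges and tolerances, not a proof.

```python
import math
import random


def rel_dist(a: complex, b: complex) -> float:
    """Barlow-Demmel-Veselic relative distance (2.16): |a - b| / sqrt(|a b|).

    Conventions from the text: 0/0 := 0 and 1/0 := infinity.
    """
    num = abs(a - b)
    den = math.sqrt(abs(a * b))
    if den == 0.0:
        return 0.0 if num == 0.0 else math.inf
    return num / den


def check_invariances(trials: int = 1000, seed: int = 5, tol: float = 1e-9) -> bool:
    """Spot-check symmetry, scaling invariance and inversion invariance."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        b = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        xi = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        if min(abs(a), abs(b), abs(xi)) < 1e-2:
            continue  # stay away from zero to keep the comparison well conditioned
        d = rel_dist(a, b)
        scale = 1.0 + d
        if abs(d - rel_dist(b, a)) > tol * scale:            # Property 2
            return False
        if abs(d - rel_dist(xi * a, xi * b)) > tol * scale:  # Property 3
            return False
        if abs(d - rel_dist(1 / a, 1 / b)) > tol * scale:    # Property 4
            return False
    return True
```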

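Proof:  Properties 1-5 are trivial. To prove Property 6, it suffices to show
that

\[ \mathrm{RelDist}(\alpha, \beta) \ge \mathrm{RelDist}(\alpha, \beta_1), \tag{2.18} \]

where $0 < \alpha \le \beta_1 \le \beta$. Since the function $1/\zeta - \zeta$
for $0 < \zeta \le 1$ is monotonically decreasing and
$0 < \alpha/\beta \le \alpha/\beta_1 \le 1$ (so that
$\sqrt{\alpha/\beta} \le \sqrt{\alpha/\beta_1}$),

\[ \mathrm{RelDist}(\alpha, \beta) = \frac{1}{\sqrt{\alpha/\beta}} - \sqrt{\alpha/\beta} \ge \frac{1}{\sqrt{\alpha/\beta_1}} - \sqrt{\alpha/\beta_1} = \mathrm{RelDist}(\alpha, \beta_1), \]

as was to be shown.

Remark:  In Property 6 of Proposition 2.8, the assumption $\alpha\beta > 0$ is
essential, since the inequality (2.17) is clearly violated if
$\alpha < 0 < \alpha_1 < \beta_1 < \beta$ and $\alpha_1$ is sufficiently close
to $0$.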

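As before, let us relate the Barlow-Demmel-Veselić relative distance to the
common sense of relative perturbations.

Proposition 2.9  Let $\alpha, \beta \in \mathbb{R}$. If $0 \le \epsilon < 1$,
then

\[ \left| \frac{\beta}{\alpha} - 1 \right| \le \epsilon \;\Rightarrow\; \mathrm{RelDist}(\alpha, \beta) \le \frac{\epsilon}{\sqrt{1 - \epsilon}}; \tag{2.19} \]

if $0 \le \epsilon < 2$, then

\[ \mathrm{RelDist}(\alpha, \beta) \le \epsilon \;\Rightarrow\; \max\left\{ \left| \frac{\beta}{\alpha} - 1 \right|, \left| \frac{\alpha}{\beta} - 1 \right| \right\} \le \left( \frac{\epsilon}{2} + \sqrt{1 + \frac{\epsilon^2}{4}} \right) \epsilon. \tag{2.20} \]

Asymptotically,

\[ \lim_{\beta \to \alpha} \frac{\mathrm{RelDist}(\alpha, \beta)}{\left| \frac{\beta}{\alpha} - 1 \right|} = 1, \]

thus (2.19) and (2.20) are at least asymptotically sharp.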

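Proof:  The left-hand side of (2.19) implies $\beta = \alpha(1 + \delta)$ for
some $\delta \in \mathbb{R}$ with $|\delta| \le \epsilon$. So

\[ \mathrm{RelDist}(\alpha, \beta) = \frac{|\delta\alpha|}{\sqrt{\alpha^2(1 + \delta)}} \le \frac{\epsilon}{\sqrt{1 - \epsilon}}, \]

as required. To prove (2.20), we set either $\xi = \alpha/\beta$ or
$\xi = \beta/\alpha$. Since $\epsilon < 2$, $\xi > 0$.
$\mathrm{RelDist}(\alpha, \beta) \overset{\mathrm{def}}{=} \eta \le \epsilon$ gives

\[ \frac{|\xi - 1|}{\sqrt{\xi}} = \eta \;\Rightarrow\; \xi^2 - (2 + \eta^2)\xi + 1 = 0, \]

solving which yields

\[ \xi = \frac{2 + \eta^2 \pm \sqrt{(2 + \eta^2)^2 - 4}}{2} = 1 + \left( \frac{\eta}{2} \pm \sqrt{1 + \frac{\eta^2}{4}} \right) \eta. \]

Hence

\[ |\xi - 1| \le \left( \frac{\eta}{2} + \sqrt{1 + \frac{\eta^2}{4}} \right) \eta \le \left( \frac{\epsilon}{2} + \sqrt{1 + \frac{\epsilon^2}{4}} \right) \epsilon, \]

as was to be shown.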
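Proposition 2.9 can also be spot-checked numerically. In the sketch below, (2.19) is tested by perturbing a random nonzero $\alpha$ by a factor $1 + \delta$ with $|\delta| \le \epsilon$, and (2.20) by taking $\epsilon$ equal to the exact distance between two random positive numbers, in which case the bound on the larger of the two relative errors is attained, so a relative tolerance is used. Names, ranges and tolerances are illustrative assumptions.

```python
import math
import random


def rel_dist(a, b):
    """Relative distance (2.16) with 0/0 := 0 and 1/0 := infinity (illustrative helper)."""
    num, den = abs(a - b), math.sqrt(abs(a * b))
    if den == 0.0:
        return 0.0 if num == 0.0 else math.inf
    return num / den


def check_prop_2_9(trials: int = 2000, seed: int = 6) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        # (2.19): beta = alpha (1 + delta) with |delta| <= eps < 1.
        alpha = rng.choice((-1.0, 1.0)) * rng.uniform(0.1, 10.0)
        eps = rng.uniform(0.0, 0.9)
        beta = alpha * (1.0 + rng.uniform(-eps, eps))
        if rel_dist(alpha, beta) > eps / math.sqrt(1.0 - eps) + 1e-12:
            return False
        # (2.20): same-sign pair, eps := exact distance, required to be below 2.
        a, b = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
        eta = rel_dist(a, b)
        if eta < 2.0:
            worst = max(abs(b / a - 1.0), abs(a / b - 1.0))
            bound = (eta / 2.0 + math.sqrt(1.0 + eta * eta / 4.0)) * eta
            if worst > bound * (1.0 + 1e-9) + 1e-12:
                return False
    return True
```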


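Proposition 2.10  Let $\tilde\beta = \beta(1 + \delta)$. Assume that
$|\beta| \le |\alpha|$ and $|\delta| \le \epsilon < 1$. Then

\[ \frac{\mathrm{RelDist}(\alpha, \beta)}{\sqrt{1 - \epsilon}} + \frac{\epsilon}{\sqrt{1 - \epsilon}} \ge \mathrm{RelDist}(\alpha, \tilde\beta) \ge \frac{\mathrm{RelDist}(\alpha, \beta)}{\sqrt{1 + \epsilon}} - \frac{\epsilon}{\sqrt{1 + \epsilon}}. \tag{2.21} \]

Proof:  Since $|\beta|(1 - \epsilon) \le |\tilde\beta| \le |\beta|(1 + \epsilon)$
and $|\beta/\alpha| \le 1$,

\[
\begin{aligned}
\mathrm{RelDist}(\alpha, \tilde\beta)
 &= \frac{|\alpha - \tilde\beta|}{\sqrt{|\alpha\tilde\beta|}}
 \ge \frac{|\alpha - \beta| - |\delta\beta|}{\sqrt{|\alpha\beta|(1 + \epsilon)}} \\
 &\ge \frac{\mathrm{RelDist}(\alpha, \beta)}{\sqrt{1 + \epsilon}} - \frac{\epsilon}{\sqrt{1 + \epsilon}} \sqrt{\left| \frac{\beta}{\alpha} \right|}
 \ge \frac{\mathrm{RelDist}(\alpha, \beta)}{\sqrt{1 + \epsilon}} - \frac{\epsilon}{\sqrt{1 + \epsilon}},
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{RelDist}(\alpha, \tilde\beta)
 &\le \frac{|\alpha - \beta| + |\delta\beta|}{\sqrt{|\alpha\beta|(1 - \epsilon)}}
 \le \frac{|\alpha - \beta| + \epsilon|\beta|}{\sqrt{|\alpha\beta|(1 - \epsilon)}} \\
 &\le \frac{\mathrm{RelDist}(\alpha, \beta)}{\sqrt{1 - \epsilon}} + \frac{\epsilon}{\sqrt{1 - \epsilon}},
\end{aligned}
\]

as required.  ■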

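Proposition 2.10, in contrast to Proposition 2.3, only provides bounds on how
RelDist varies when the argument of smaller magnitude is perturbed a little.
Generally, we do not have a nice inequality like (2.11) for RelDist.
Following the lines of the proof above, one can establish

\[ \frac{\mathrm{RelDist}(\alpha, \beta)}{1 - \epsilon} + \frac{\epsilon}{1 - \epsilon} \cdot \frac{|\alpha| + |\beta|}{\sqrt{|\alpha\beta|}} \ge \mathrm{RelDist}(\tilde\alpha, \tilde\beta) \ge \frac{\mathrm{RelDist}(\alpha, \beta)}{1 + \epsilon} - \frac{\epsilon}{1 + \epsilon} \cdot \frac{|\alpha| + |\beta|}{\sqrt{|\alpha\beta|}}, \]

where $\tilde\alpha = \alpha(1 + \delta_1)$ with $|\delta_1| \le \epsilon$. So
the ratio $\frac{|\alpha| + |\beta|}{\sqrt{|\alpha\beta|}}$, which could be very
large, plays a crucial role.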


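Proposition 2.11  For $\alpha, \beta \ge 0$,

\[ \mathrm{RelDist}(\alpha^2, \beta^2) \ge 2\,\mathrm{RelDist}(\alpha, \beta), \]

and the equality sign holds if and only if $\alpha = \beta$.

Proof:  If either $\alpha$ or $\beta$ is zero, no proof is required. Assume
both are positive. Then

\[ \mathrm{RelDist}(\alpha^2, \beta^2) = \frac{\alpha + \beta}{\sqrt{\alpha\beta}} \cdot \frac{|\alpha - \beta|}{\sqrt{\alpha\beta}} \ge 2\,\frac{|\alpha - \beta|}{\sqrt{\alpha\beta}} = 2\,\mathrm{RelDist}(\alpha, \beta), \]

as was to be shown.  ■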

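Again, there is no universal constant $c > 0$ such that
$\mathrm{RelDist}(\alpha^2, \beta^2)$ is bounded by
$c \times \mathrm{RelDist}(\alpha, \beta)$, unlike (2.12) in Proposition 2.4,
because the factor $(\alpha + \beta)/\sqrt{\alpha\beta}$ in the proof above is
unbounded. One can always bound $\mathrm{RelDist}_p$ by RelDist, but not the
other way around.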

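Proposition 2.12  For $\alpha, \beta \in \mathbb{C}$,

\[ \mathrm{RelDist}_p(\alpha, \beta) \le 2^{-1/p}\,\mathrm{RelDist}(\alpha, \beta), \]

and the equality sign holds if and only if $|\alpha| = |\beta|$.

Proof:  Since

\[ |\alpha|^p + |\beta|^p \ge 2\sqrt{|\alpha|^p |\beta|^p} = 2|\alpha\beta|^{p/2} \;\Rightarrow\; \sqrt[p]{|\alpha|^p + |\beta|^p} \ge 2^{1/p}\sqrt{|\alpha\beta|}, \]

from which the inequality follows.  ■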
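The comparisons in Propositions 2.11 and 2.12 are straightforward to test numerically. The following sketch checks $\mathrm{RelDist}(\alpha^2, \beta^2) \ge 2\,\mathrm{RelDist}(\alpha, \beta)$ for random positive reals and $\mathrm{RelDist}_p(\alpha, \beta) \le 2^{-1/p}\,\mathrm{RelDist}(\alpha, \beta)$ for random nonzero complex numbers. Helper names, ranges and tolerances are our own assumptions.

```python
import math
import random


def rel_dist_p(a, b, p):
    """p-relative distance (2.1) with the convention 0/0 := 0 (illustrative helper)."""
    num = abs(a - b)
    den = max(abs(a), abs(b)) if math.isinf(p) else (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.0 if den == 0.0 else num / den


def rel_dist(a, b):
    """Relative distance (2.16) with 0/0 := 0 and 1/0 := infinity (illustrative helper)."""
    num, den = abs(a - b), math.sqrt(abs(a * b))
    if den == 0.0:
        return 0.0 if num == 0.0 else math.inf
    return num / den


def check_props_2_11_and_2_12(trials: int = 2000, seed: int = 7, tol: float = 1e-9) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
        # Proposition 2.11: squaring at least doubles RelDist for positive numbers.
        if rel_dist(x * x, y * y) < 2.0 * rel_dist(x, y) - tol:
            return False
        a = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        b = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        if min(abs(a), abs(b)) < 1e-2:
            continue
        d = rel_dist(a, b)
        for p in (1.0, 2.0, 3.0, math.inf):
            # Proposition 2.12: RelDist_p <= 2^(-1/p) RelDist (1/inf evaluates to 0.0).
            if rel_dist_p(a, b, p) > 2.0 ** (-1.0 / p) * d + tol * (1.0 + d):
                return False
    return True
```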

Proposition 2.12 is useful in that, as we will see later, any bound with
RelDist yields a bound with $\mathrm{RelDist}_p$. Now consider the same pairing
problem for this newly defined RelDist. First of all, the conclusion of
Proposition 2.7 clearly remains valid if $\mathrm{RelDist}_2$ is replaced by
RelDist, because of Property 5 in Proposition 2.8; second, with the help of
Property 6 in Proposition 2.8 we can prove the same conclusion for RelDist as
that for $\mathrm{RelDist}_p$ in Proposition 2.5.

Proposition 2.13  Under the conditions of Proposition 2.5, we have

\[ \max_{1 \le i \le n} \mathrm{RelDist}(\alpha_i, \tilde\alpha_i) = \min_\tau \max_{1 \le i \le n} \mathrm{RelDist}(\alpha_i, \tilde\alpha_{\tau(i)}), \]

where the minimization is taken over all permutations $\tau$ of
$\{1, 2, \ldots, n\}$.

Remark.  Proposition 2.13 may fail if not all $\alpha_i$'s and
$\tilde\alpha_j$'s are of the same sign.

A  counterexample  is  as  follows:  n  =  2  and 

ay  =  —1  <  ay  =  1  and  ct\  =  —  <  «2  =  2. 

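We have shown that (2.15) cannot hold in general. In what follows, we will
see that RelDist can do better.

Lemma 2.1  Let $0 < \alpha_1 \le \alpha_2$ and
$0 < \tilde\alpha_1 \le \tilde\alpha_2$. Then

\[ [\mathrm{RelDist}(\alpha_1, \tilde\alpha_1)]^2 + [\mathrm{RelDist}(\alpha_2, \tilde\alpha_2)]^2 \le [\mathrm{RelDist}(\alpha_1, \tilde\alpha_2)]^2 + [\mathrm{RelDist}(\alpha_2, \tilde\alpha_1)]^2, \]

or in other words,

\[ \frac{(\tilde\alpha_1 - \alpha_1)^2}{\alpha_1\tilde\alpha_1} + \frac{(\tilde\alpha_2 - \alpha_2)^2}{\alpha_2\tilde\alpha_2} \le \frac{(\tilde\alpha_2 - \alpha_1)^2}{\alpha_1\tilde\alpha_2} + \frac{(\tilde\alpha_1 - \alpha_2)^2}{\alpha_2\tilde\alpha_1}, \]

and the equality sign holds if and only if either $\alpha_1 = \alpha_2$ or
$\tilde\alpha_1 = \tilde\alpha_2$.

Proof:  Complicated algebraic manipulations show that

\[
\alpha_1\tilde\alpha_1\alpha_2\tilde\alpha_2 \left( \frac{(\tilde\alpha_1 - \alpha_1)^2}{\alpha_1\tilde\alpha_1} + \frac{(\tilde\alpha_2 - \alpha_2)^2}{\alpha_2\tilde\alpha_2} - \frac{(\tilde\alpha_2 - \alpha_1)^2}{\alpha_1\tilde\alpha_2} - \frac{(\tilde\alpha_1 - \alpha_2)^2}{\alpha_2\tilde\alpha_1} \right)
= -(\alpha_2 - \alpha_1)(\tilde\alpha_2 - \tilde\alpha_1)(\alpha_1\alpha_2 + \tilde\alpha_1\tilde\alpha_2) \le 0,
\]

and the equality sign holds if and only if either $\alpha_1 = \alpha_2$ or
$\tilde\alpha_1 = \tilde\alpha_2$.  ■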
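The algebraic identity behind Lemma 2.1 can be verified exactly on random rational inputs using Python's `fractions.Fraction`, which avoids any rounding. The function names and the way test values are sampled are our own illustrative choices.

```python
import random
from fractions import Fraction


def lemma_gap(a1, a2, t1, t2):
    """Natural-pairing sum minus crossed-pairing sum of squared RelDist values.

    Here (a1, a2) plays the role of (alpha_1, alpha_2) and (t1, t2) that of the
    perturbed pair; all four are assumed positive.
    """
    natural = (t1 - a1) ** 2 / (a1 * t1) + (t2 - a2) ** 2 / (a2 * t2)
    crossed = (t2 - a1) ** 2 / (a1 * t2) + (t1 - a2) ** 2 / (a2 * t1)
    return natural - crossed


def check_lemma_2_1(trials: int = 500, seed: int = 8) -> bool:
    rng = random.Random(seed)

    def positive_fraction() -> Fraction:
        return Fraction(rng.randint(1, 50), rng.randint(1, 50))

    for _ in range(trials):
        a1, a2 = sorted(positive_fraction() for _ in range(2))
        t1, t2 = sorted(positive_fraction() for _ in range(2))
        gap = lemma_gap(a1, a2, t1, t2)
        # Closed form from the proof, divided by the common positive factor.
        closed_form = -(a2 - a1) * (t2 - t1) * (a1 * a2 + t1 * t2) / (a1 * t1 * a2 * t2)
        if gap != closed_form or gap > 0:
            return False
    return True
```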

Armed with Lemma 2.1, by following the proof of Proposition 2.5, one can show
the following.

Proposition 2.14  Let $\{\alpha_1, \ldots, \alpha_n\}$ and
$\{\tilde\alpha_1, \ldots, \tilde\alpha_n\}$ be two sequences of $n$ positive
numbers ordered ascendingly (descendingly) as in (2.14). Then

\[ \sum_{i=1}^n [\mathrm{RelDist}(\alpha_i, \tilde\alpha_i)]^2 = \min_\tau \sum_{i=1}^n [\mathrm{RelDist}(\alpha_i, \tilde\alpha_{\tau(i)})]^2, \]

where the minimization is taken over all permutations $\tau$ of
$\{1, 2, \ldots, n\}$.

Remark.  It is clear that the conclusion of Proposition 2.14 remains valid if
we weaken the conditions by only assuming that the $\alpha_i$'s and
$\tilde\alpha_j$'s are nonnegative and that the number of zeros among the
$\alpha_i$'s equals that among the $\tilde\alpha_j$'s. Proposition 2.14 may
fail if not all $\alpha_i$'s and $\tilde\alpha_j$'s are of the same sign. Here
is a counterexample: $n = 2$ and

\[ \alpha_1 = -2 < \alpha_2 = 1 \quad \text{and} \quad \tilde\alpha_1 = 1 < \tilde\alpha_2 = 2. \]
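Proposition 2.14 and the counterexample in the remark can both be examined by brute force for small $n$. The sketch below compares the sorted pairing against all permutations; the function names and the sample sequences in the usage note are our own, and the exhaustive search is meant for illustration only.

```python
import itertools
import math


def rel_dist(a, b):
    """Relative distance (2.16) with 0/0 := 0 and 1/0 := infinity (illustrative helper)."""
    num, den = abs(a - b), math.sqrt(abs(a * b))
    if den == 0.0:
        return 0.0 if num == 0.0 else math.inf
    return num / den


def sum_of_squares(alpha, alpha_t, perm):
    """Sum of squared RelDist values when alpha[i] is paired with alpha_t[perm[i]]."""
    return sum(rel_dist(alpha[i], alpha_t[perm[i]]) ** 2 for i in range(len(alpha)))


def natural_pairing_is_optimal(alpha, alpha_t) -> bool:
    """True if pairing the sorted sequences index by index minimizes the sum of squares."""
    alpha, alpha_t = sorted(alpha), sorted(alpha_t)
    n = len(alpha)
    natural = sum_of_squares(alpha, alpha_t, tuple(range(n)))
    best = min(sum_of_squares(alpha, alpha_t, perm)
               for perm in itertools.permutations(range(n)))
    return natural <= best * (1.0 + 1e-12) + 1e-12
```

With positive data such as `natural_pairing_is_optimal([0.5, 1.0, 3.0, 7.0], [0.4, 1.5, 2.0, 9.0])` the sorted pairing is optimal, whereas for the mixed-sign data of the remark, `natural_pairing_is_optimal([-2.0, 1.0], [1.0, 2.0])`, the sorted pairing gives 5 while the crossed pairing gives 4.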
2.3  Are RelDist_p and RelDist Metrics?

Let $\mathbb{X}$ be a space. Recall that a function
$d : \mathbb{X} \times \mathbb{X} \to [0, \infty)$ is called a metric if it has
the following three properties: for $\alpha, \beta, \gamma \in \mathbb{X}$

1.  $d(\alpha, \beta) = 0$ if and only if $\alpha = \beta$;

2.  $d(\alpha, \beta) = d(\beta, \alpha)$;

3.  $d(\alpha, \gamma) \le d(\alpha, \beta) + d(\beta, \gamma)$.

This definition immediately excludes the possibility that RelDist is a metric
on $\mathbb{C}$, or even on $\mathbb{R}$, since
$\mathrm{RelDist}(\alpha, 0) = \infty$ for $\alpha \ne 0$. To get around this,
we extend the definition, as any mathematician would, by calling
$d : \mathbb{X} \times \mathbb{X} \to [0, \infty]$ a generalized metric if it
possesses the above three properties.

Now take a look at Propositions 2.1 and 2.8. We see that the functions
$\mathrm{RelDist}_p$ and RelDist on $\mathbb{C} \times \mathbb{C}$ satisfy the
first two properties in the definition of a (generalized) metric. Naturally,
we would like to ask: is $\mathrm{RelDist}_p$ a metric on $\mathbb{C}$? And is
RelDist a generalized metric on $\mathbb{C}$? Or, equivalently, we may ask
whether for $\alpha, \beta, \gamma \in \mathbb{C}$

\[ \mathrm{RelDist}_p(\alpha, \gamma) \le \mathrm{RelDist}_p(\alpha, \beta) + \mathrm{RelDist}_p(\beta, \gamma)? \tag{2.22} \]

\[ \mathrm{RelDist}(\alpha, \gamma) \le \mathrm{RelDist}(\alpha, \beta) + \mathrm{RelDist}(\beta, \gamma)? \tag{2.23} \]

At this point, we are able to formulate our incomplete answers in
Proposition 2.15. Since the proof is quite long and tedious, we leave it to
Appendices A and B. Denote

\[ \mathbb{R}_{\ge 0} \overset{\mathrm{def}}{=} [0, \infty) \quad \text{and} \quad \mathbb{R}_+ \overset{\mathrm{def}}{=} (0, \infty). \]

Proposition 2.15

1.  (2.22) holds for all $\alpha, \beta, \gamma \ge 0$ and
$1 \le p \le \infty$, and thus $\mathrm{RelDist}_p$ is a metric on
$\mathbb{R}_{\ge 0}$;

2.  (2.22) with $p = 1$, $2$ or $\infty$ holds for
$\alpha, \beta, \gamma \in \mathbb{R}$, and thus $\mathrm{RelDist}_1$,
$\mathrm{RelDist}_2$ and $\mathrm{RelDist}_\infty$ are metrics on $\mathbb{R}$;

3.  (2.23) holds for $\alpha, \beta, \gamma \ge 0$, but not on the whole of
$\mathbb{R}$, and thus RelDist is a generalized metric on $\mathbb{R}_{\ge 0}$,
but not on $\mathbb{R}$ nor on $\mathbb{C}$.

Still, the question whether $\mathrm{RelDist}_p$ is a metric on $\mathbb{C}$ is
open.
3  Summary of Results

To help the reader quickly grasp what we have accomplished in this paper, we
give here a table that partially summarizes simplified versions of our
theorems in comparison with the corresponding well-known theorems in the
literature. Full statements of these theorems and their stronger versions are
given in §5 and §6. More results will be discussed in §7. Before we present
the table, let us fix some notation: $A, \tilde A \in \mathbb{C}^{n\times n}$,
and

\[ \lambda(A) = \{\lambda_1, \ldots, \lambda_n\} \quad \text{and} \quad \lambda(\tilde A) = \{\tilde\lambda_1, \ldots, \tilde\lambda_n\}; \tag{3.1} \]

$B, \tilde B \in \mathbb{C}^{m\times n}$, and

\[ \sigma(B) = \{\sigma_1, \ldots, \sigma_n\} \quad \text{and} \quad \sigma(\tilde B) = \{\tilde\sigma_1, \ldots, \tilde\sigma_n\}. \tag{3.2} \]

In the table, $\tau$ always stands for some permutation of
$\{1, 2, \ldots, n\}$; the $\sigma_i$'s and $\tilde\sigma_j$'s are assumed to be
in descending order, i.e.,

\[ \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0, \quad \tilde\sigma_1 \ge \tilde\sigma_2 \ge \cdots \ge \tilde\sigma_n \ge 0. \tag{3.3} \]

Whenever all $\lambda_i$'s and $\tilde\lambda_j$'s are real, we also require

\[ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n, \quad \tilde\lambda_1 \ge \tilde\lambda_2 \ge \cdots \ge \tilde\lambda_n. \tag{3.4} \]

In Table 3.1, each row consists of four boxes. The first one describes the
conditions under which the inequality in the second box holds; the third one
states the conditions that are needed, in addition to those in the first box,
for the inequality in the fourth box to be true.
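Table 3.1.  Summary: (i) Hoffman and Wielandt Type Theorems

Conditions | Classical Bounds | Additional Conditions | New Relative Bounds

$A$ and $\tilde A$ normal | $\sqrt{\sum_{i=1}^n |\lambda_i - \tilde\lambda_{\tau(i)}|^2} \le \|\tilde A - A\|_F$ (Theorem 4.1) | $\tilde A = D_1^* A D_2$ | $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \le \min\{ \sqrt{\|I - D_1\|_F^2 + \|I - D_2^{-1}\|_F^2},\ \sqrt{\|I - D_1^{-1}\|_F^2 + \|I - D_2\|_F^2} \}$ (Theorem 6.2)

$A$ and $\tilde A$ Hermitian | $\sqrt{\sum_{i=1}^n |\lambda_i - \tilde\lambda_i|^2} \le \|\tilde A - A\|_F$ (Theorems 4.1 and 4.3) | $\tilde A = D^* A D$ | $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \le \sqrt{\|I - D\|_F^2 + \|I - D^{-1}\|_F^2}$ (Theorem 6.3)

$A$ and $\tilde A$ definite | $\sqrt{\sum_{i=1}^n |\lambda_i - \tilde\lambda_i|^2} \le \|\tilde A - A\|_F$ (Theorems 4.1 and 4.3) | $\tilde A = D^* A D$ | $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\lambda_i, \tilde\lambda_i)]^2} \le \|D^* - D^{-1}\|_F$ (Theorem 5.1)

$A = X\Lambda X^{-1}$, $\tilde A = \tilde X\tilde\Lambda\tilde X^{-1}$ | $\sqrt{\sum_{i=1}^n |\lambda_i - \tilde\lambda_{\tau(i)}|^2} \le \kappa(X)\kappa(\tilde X)\|\tilde A - A\|_F$ (Theorem 4.2) | $\tilde A = D_1^* A D_2$ | $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \le \kappa(X)\kappa(\tilde X)\min\{ \sqrt{\|I - D_1\|_F^2 + \|I - D_2^{-1}\|_F^2},\ \sqrt{\|I - D_1^{-1}\|_F^2 + \|I - D_2\|_F^2} \}$ (Theorem 6.1)

$B$ and $\tilde B$ | $\sqrt{\sum_{i=1}^n |\sigma_i - \tilde\sigma_i|^2} \le \|\tilde B - B\|_F$ (Theorem 4.7) | $\tilde B = D_1^* B D_2$ | $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\sigma_i, \tilde\sigma_{\tau(i)})]^2} \le \left[ \|I - D_1\|_F^2 + \|I - D_1^{-1}\|_F^2 + \|I - D_2\|_F^2 + \|I - D_2^{-1}\|_F^2 \right]^{1/2}$ (Theorem 6.7)

$B$ and $\tilde B$ | $\sqrt{\sum_{i=1}^n |\sigma_i - \tilde\sigma_i|^2} \le \|\tilde B - B\|_F$ (Theorem 4.7) | $\tilde B = D_1^* B D_2$ | $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \tilde\sigma_i)]^2} \le \frac{\|D_1^* - D_1^{-1}\|_F + \|D_2^* - D_2^{-1}\|_F}{2}$ (Theorem 5.2)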

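Table 3.1.  Summary (continued): (ii) Weyl-Lidskii Type Theorems

Conditions | Classical Bounds | Additional Conditions | New Relative Bounds

$A$ and $\tilde A$ Hermitian | $|\lambda_i - \tilde\lambda_i| \le \|\tilde A - A\|_2$ (Theorem 4.3) | $\tilde A = D^* A D$ | $\mathrm{RelDist}_\infty(\lambda_i, \tilde\lambda_i) \le \|I - D^* D\|_2$; $\mathrm{RelDist}(\lambda_i, \tilde\lambda_i) \le \ldots$ (Cf. (7.3) and (7.4))

$A$ and $\tilde A$ definite | $|\lambda_i - \tilde\lambda_i| \le \|\tilde A - A\|_2$ (Theorem 4.3) | $\tilde A = D^* A D$ | $\mathrm{RelDist}(\lambda_i, \tilde\lambda_i) \le \|D^* - D^{-1}\|_2$ (Theorem 5.1)

$A = X\Lambda X^{-1}$, $\tilde A = \tilde X\tilde\Lambda\tilde X^{-1}$; $\Lambda$ and $\tilde\Lambda$ real nonnegative | $|\lambda_i - \tilde\lambda_i| \le \kappa(X)\kappa(\tilde X)\|\tilde A - A\|_2$ (Theorems 4.4 and 4.5) | $\tilde A = D_1^* A D_2$ | $\mathrm{RelDist}_p(\lambda_i, \tilde\lambda_i) \le \kappa(X)\kappa(\tilde X)\min\{ \sqrt[q]{\|I - D_1\|_2^q + \|I - D_2^{-1}\|_2^q},\ \sqrt[q]{\|I - D_1^{-1}\|_2^q + \|I - D_2\|_2^q} \}$ (Theorem 6.4)

$B$ and $\tilde B$ | $|\sigma_i - \tilde\sigma_i| \le \|\tilde B - B\|_2$ (Theorem 4.7) | $\tilde B = D_1^* B D_2$ | $\mathrm{RelDist}_p(\sigma_i, \tilde\sigma_i) \le \min\{ \sqrt[q]{\|I - D_1^{-1}\|_2^q + \|I - D_2\|_2^q},\ \sqrt[q]{\|I - D_1\|_2^q + \|I - D_2^{-1}\|_2^q} \}$ (Theorem 6.8)

$B$ and $\tilde B$ | $|\sigma_i - \tilde\sigma_i| \le \|\tilde B - B\|_2$ (Theorem 4.7) | $\tilde B = D_1^* B D_2$ | $\mathrm{RelDist}(\sigma_i, \tilde\sigma_i) \le \frac{\|D_1^* - D_1^{-1}\|_2 + \|D_2^* - D_2^{-1}\|_2}{2}$ (Theorem 5.2)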

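Table 3.1.  Summary (continued): (iii) A Bauer-Fike Type Theorem

Conditions | Classical Bounds | Additional Conditions | New Relative Bounds

$A = X\Lambda X^{-1}$ | $\forall \tilde\lambda \in \lambda(\tilde A)$, $\exists \lambda \in \lambda(A)$ such that $|\tilde\lambda - \lambda| \le \kappa(X)\|\tilde A - A\|_2$ (Theorem 4.6) | Either $\tilde A = AD$ or $\tilde A = DA$ | $\forall \tilde\lambda \in \lambda(\tilde A)$, $\exists \lambda \in \lambda(A)$ such that $\frac{|\tilde\lambda - \lambda|}{|\lambda|} \le \kappa(X)\|I - D\|_2$ (Theorem 6.6)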
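Finally, let us consider the graded case, for which we use $H = D^* A D$ and
$\tilde H = D^* \tilde A D$ for two $n \times n$ graded nonnegative definite
Hermitian matrices with $A$ nonsingular and
$\|A^{-1}\|_2 \|\Delta A\|_2 < 1$, where
$\Delta A \overset{\mathrm{def}}{=} \tilde A - A$, and $G = BD$ and
$\tilde G = \tilde B D$ for two $m \times n$ graded matrices whose singular
values are of interest. It is also required that $B$ is nonsingular and
$\|B^{-1}\|_2 \|\Delta B\|_2 < 1$, where
$\Delta B \overset{\mathrm{def}}{=} \tilde B - B$. Denote

\[ \lambda(H) = \{\lambda_1, \ldots, \lambda_n\} \quad \text{and} \quad \lambda(\tilde H) = \{\tilde\lambda_1, \ldots, \tilde\lambda_n\}; \]

and

\[ \sigma(G) = \{\sigma_1, \ldots, \sigma_n\} \quad \text{and} \quad \sigma(\tilde G) = \{\tilde\sigma_1, \ldots, \tilde\sigma_n\}, \]

and arrange them in the order prescribed by (3.3) and (3.4). Set

\[ E_A \overset{\mathrm{def}}{=} A^{-1/2}(\Delta A)A^{-1/2} \quad \text{and} \quad E_B \overset{\mathrm{def}}{=} (\Delta B)B^{-1}. \]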

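Table 3.1. Summary (continued): (iv) Theorems for Graded Matrices

Matrices: $H$ and $\widetilde H$ definite.
  Classical bound: $\sqrt{\sum_{i=1}^n |\lambda_i - \tilde\lambda_i|^2} \le \|H - \widetilde H\|_F$ (Theorems 4.1 and 4.3).
  Perturbation: $H = D^*AD$ and $\widetilde H = D^*\widetilde AD$.
  New relative bound: $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\lambda_i, \tilde\lambda_i)]^2} \le \|(I + E_A)^{1/2} - (I + E_A)^{-1/2}\|_F$ (Theorem 5.4).

Matrices: $H$ and $\widetilde H$ definite.
  Classical bound: $|\lambda_i - \tilde\lambda_i| \le \|H - \widetilde H\|_2$ (Theorem 4.3).
  Perturbation: $H = D^*AD$ and $\widetilde H = D^*\widetilde AD$.
  New relative bound: $\mathrm{RelDist}(\lambda_i, \tilde\lambda_i) \le \|(I + E_A)^{1/2} - (I + E_A)^{-1/2}\|_2$ (Theorem 5.4).

Matrices: $G$ and $\widetilde G$.
  Classical bound: $\sqrt{\sum_{i=1}^n |\sigma_i - \tilde\sigma_i|^2} \le \|G - \widetilde G\|_F$ (Theorem 4.7).
  Perturbation: $G = BD$ and $\widetilde G = \widetilde BD$.
  New relative bound: $\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \tilde\sigma_i)]^2} \le \dfrac{\|(I + E_B)^* - (I + E_B)^{-1}\|_F}{2}$ (Theorem 5.3).

Matrices: $G$ and $\widetilde G$.
  Classical bound: $|\sigma_i - \tilde\sigma_i| \le \|G - \widetilde G\|_2$ (Theorem 4.7).
  Perturbation: $G = BD$ and $\widetilde G = \widetilde BD$.
  New relative bound: $\mathrm{RelDist}(\sigma_i, \tilde\sigma_i) \le \dfrac{\|(I + E_B)^* - (I + E_B)^{-1}\|_2}{2}$ (Theorem 5.3).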
4 Known Perturbation Theorems for Eigenvalue and Singular Value Variations

In this section, we briefly review a few of the most celebrated theorems for eigenvalue and singular value variations, which will be generalized. Most of these theorems can be found in Bhatia [3], Golub and Van Loan [14], Parlett [28] and Stewart and Sun [30]. Notation introduced in §3 will be followed strictly. Hoffman and Wielandt [16] proved

Theorem 4.1 (Hoffman-Wielandt) If $A$ and $\widetilde A$ are normal, then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\[ \sqrt{\sum_{i=1}^n |\lambda_i - \tilde\lambda_{\tau(i)}|^2} \le \|A - \widetilde A\|_F. \]


For a nonsingular matrix $X \in \mathbb{C}^{n\times n}$, the (spectral) condition number $\kappa(X)$ is defined as

\[ \kappa(X) \stackrel{\rm def}{=} \|X\|_2\|X^{-1}\|_2. \]

Theorem  4.1  was  generalized  by  Sun  [33]  and  Zhang  [37]  to  two  diagonalizable 
matrices. 


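Theorem 4.2 (Sun-Zhang) Assume that both $A$ and $\widetilde A$ are diagonalizable and admit the following decompositions

\[ A = X\Lambda X^{-1} \quad\text{and}\quad \widetilde A = \widetilde X\widetilde\Lambda\widetilde X^{-1}, \tag{4.1} \]

where $X$ and $\widetilde X$ are nonsingular and

\[ \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n) \quad\text{and}\quad \widetilde\Lambda = \mathrm{diag}(\tilde\lambda_1, \dots, \tilde\lambda_n). \tag{4.2} \]

Then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\[ \sqrt{\sum_{i=1}^n |\lambda_i - \tilde\lambda_{\tau(i)}|^2} \le \kappa(X)\kappa(\widetilde X)\|A - \widetilde A\|_F. \]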


We will consider unitarily invariant norms $|||\cdot|||$ of matrices. In this we follow Mirsky [27] and Stewart & Sun [30]. To say that the norm is unitarily invariant on $\mathbb{C}^{m\times n}$ means that it satisfies, besides the usual properties of any norm, also

1. $|||UYV||| = |||Y|||$, for any $U \in \mathbb{U}_m$ and $V \in \mathbb{U}_n$;

2. $|||Y||| = \|Y\|_2$, for any $Y \in \mathbb{C}^{m\times n}$ with $\mathrm{rank}\,Y = 1$.

Two unitarily invariant norms used frequently are the spectral norm $\|\cdot\|_2$ and the Frobenius norm $\|\cdot\|_F$. Let $|||\cdot|||$ be a unitarily invariant norm on some matrix space. The following inequalities [30, p. 80] will be employed very frequently in the rest of this paper:

\[ |||YZ||| \le \|Y\|_2\,|||Z||| \quad\text{and}\quad |||YZ||| \le |||Y|||\,\|Z\|_2. \]


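Theorem 4.3 Suppose that $A$ and $\widetilde A$ are both Hermitian, and that (3.4) holds. Then for any unitarily invariant norm $|||\cdot|||$

\[ |||\mathrm{diag}(\lambda_1 - \tilde\lambda_1, \dots, \lambda_n - \tilde\lambda_n)||| \le |||A - \widetilde A|||. \tag{4.3} \]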
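As a quick numerical illustration of (4.3), the following sketch compares both sides for three unitarily invariant norms (spectral, Frobenius, nuclear). The matrices are random test data, and the helper `random_hermitian` is a name introduced here for illustration only; this is a sanity check under those assumptions, not part of any proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

def random_hermitian(size, rng):
    # Illustrative helper (not from the paper): a random complex Hermitian matrix.
    M = rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))
    return (M + M.conj().T) / 2

A = random_hermitian(n, rng)
A_tilde = A + 1e-2 * random_hermitian(n, rng)

# eigvalsh returns ascending eigenvalues; reverse to get the descending order of (3.4).
lam = np.linalg.eigvalsh(A)[::-1]
lam_tilde = np.linalg.eigvalsh(A_tilde)[::-1]
diff = np.diag(lam - lam_tilde)

# Spectral, Frobenius and nuclear norms are all unitarily invariant.
checks = {}
for kind in (2, "fro", "nuc"):
    checks[kind] = (np.linalg.norm(diff, kind), np.linalg.norm(A - A_tilde, kind))

for kind, (lhs, rhs) in checks.items():
    print(f"norm {kind}: {lhs:.3e} <= {rhs:.3e}")
```

Any other unitarily invariant norm (for instance a Ky Fan norm) could be substituted in the loop.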


The  inequality  (4.3)  was  proved  by  Weyl  [35]  for  the  spectral  norm,  by  Loewner 
[24]  and  as  a  corollary  of  Hoffman-Wielandt  theorem  [16]  for  the  Frobenius  norm 
and  by  Lidskii  [23],  Wielandt  [36]  and  Mirsky  [27]  for  all  unitarily  invariant 
norms.  Neither  Lidskii  nor  Wielandt  mentioned  explicitly  (4.3)  which  was  done 
by  Mirsky  [27].  For  more  detail,  the  reader  is  referred  to  Bhatia  [3].  Theorem  4.3 
has  been  generalized  in  many  aspects.  The  following  theorem  is  due  to  Bhatia, 
Davis  and  Kittaneh  [4]. 

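Theorem 4.4 (Kahan, Bhatia, Davis and Kittaneh) To the hypotheses of Theorem 4.2 add this: all $\lambda_i$'s and $\tilde\lambda_j$'s are real and are arranged descendingly as in (3.4). Then for any unitarily invariant norm $|||\cdot|||$

\[ |||\mathrm{diag}(\lambda_1 - \tilde\lambda_1, \dots, \lambda_n - \tilde\lambda_n)||| \le \kappa(X)\kappa(\widetilde X)\,|||A - \widetilde A|||. \tag{4.4} \]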


The inequality (4.4) was proved by Kahan [19] for the spectral norm, and as a corollary of the Sun-Zhang theorem [33, 37] for the Frobenius norm. In another aspect, the inequality (4.3) for the spectral norm was generalized to the $\ell_p$ operator norm. The $p$-Hölder norm of a vector $x = (\xi_i) \in \mathbb{C}^n$ is defined by

\[ \|x\|_p \stackrel{\rm def}{=} \Big( \sum_{i=1}^n |\xi_i|^p \Big)^{1/p}. \]

The $\ell_p$-operator norm of a matrix $X \in \mathbb{C}^{n\times n}$ is defined by

\[ \|X\|_p = \max_{\|x\|_p = 1} \|Xx\|_p. \]

If $X$ is nonsingular, its $\ell_p$ condition number is defined by

\[ \kappa_p(X) \stackrel{\rm def}{=} \|X\|_p\|X^{-1}\|_p. \]

Clearly, $\kappa_2(X) = \kappa(X)$, the (spectral) condition number. The following theorem is due to Li [21, pp. 225-226].
Theorem 4.5 (Li) Under the conditions of Theorem 4.4,

\[ \max_{1\le i\le n} |\lambda_i - \tilde\lambda_i| \le \kappa_p(X)\kappa_p(\widetilde X)\|A - \widetilde A\|_p, \]

where $1 \le p \le \infty$.
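The bound of Theorem 4.5 can be evaluated directly for $p = 1, 2, \infty$, since NumPy's `np.linalg.norm` returns the induced operator norm of a matrix for these three values. The sketch below builds $A$ and $\widetilde A$ from prescribed real spectra; the eigenvector matrices `X` and `X_tilde` and the spectra themselves are hypothetical test data, chosen close to the identity so that they are safely nonsingular.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Prescribed real spectra in descending order, as Theorem 4.4 requires.
lam = np.array([4.0, 3.0, 2.0, 1.0])
lam_tilde = np.array([4.1, 2.9, 2.05, 0.98])

# Hypothetical eigenvector matrices, kept close to I so they are well conditioned.
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))
X_tilde = np.eye(n) + 0.1 * rng.standard_normal((n, n))

A = X @ np.diag(lam) @ np.linalg.inv(X)
A_tilde = X_tilde @ np.diag(lam_tilde) @ np.linalg.inv(X_tilde)

def kappa(M, p):
    # l_p condition number kappa_p(M) = ||M||_p ||M^{-1}||_p.
    return np.linalg.norm(M, p) * np.linalg.norm(np.linalg.inv(M), p)

gap = np.max(np.abs(lam - lam_tilde))
bounds = {p: kappa(X, p) * kappa(X_tilde, p) * np.linalg.norm(A - A_tilde, p)
          for p in (1, 2, np.inf)}

for p, b in bounds.items():
    print(f"p = {p}: max |lam_i - lam_tilde_i| = {gap:.3e} <= {b:.3e}")
```

Because the eigenvalues are known by construction, no eigenvalue solver is needed, which avoids spurious imaginary parts from a nonsymmetric solver.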

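Generally, if one of $A$ and $\widetilde A$ is diagonalizable, we have the following result due to Bauer and Fike¹ [2].

Theorem 4.6 (Bauer-Fike) Assume $A$ is diagonalizable, i.e.,

\[ A = X\Lambda X^{-1}, \quad\text{where}\quad \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n). \]

Then for any $\tilde\lambda \in \lambda(\widetilde A)$, there exists a $\lambda \in \lambda(A)$ such that

\[ |\lambda - \tilde\lambda| \le \kappa(X)\|A - \widetilde A\|_2. \tag{4.5} \]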

Regarding singular value perturbations, the following theorem was established in Mirsky [27], based on Lidskii [23] and Wielandt [36].

Theorem 4.7 Arrange the singular values of $B$ and $\widetilde B$ in descending order as in (3.3). Then for any unitarily invariant norm $|||\cdot|||$

\[ |||\mathrm{diag}(\sigma_1 - \tilde\sigma_1, \dots, \sigma_n - \tilde\sigma_n)||| \le |||B - \widetilde B|||. \tag{4.6} \]

¹ One can prove a slightly stronger inequality than (4.5):

\[ |\lambda - \tilde\lambda| \le \|X^{-1}(A - \widetilde A)X\|_2. \]
5 Statement of Theorems with RelDist: Nonnegative Definite Matrices

In this section, we devote our attention to the relative perturbation theory for eigenvalues of nonnegative definite matrices, including singular value problems. We will consider the following problems:

• Eigenvalue problems:

1. $A$ and $\widetilde A = D^*AD$ with $A$ nonnegative definite and $D$ being close to some unitary matrix;

2. $H = D^*AD$ and $\widetilde H = D^*\widetilde AD$ with both $A$ and $\widetilde A$ positive definite and $\|A^{-1}\|_2\|\widetilde A - A\|_2 < 1$, where $D$ is some square matrix.

• Singular value problems:

1. $B$ and $\widetilde B = D_1^*BD_2$ with $D_1$ and $D_2$ being close to some unitary matrices of suitable dimensions;

2. $G = BD$ and $\widetilde G = \widetilde BD$ with both $B$ and $\widetilde B$ nonsingular and $\|B^{-1}\|_2\|\widetilde B - B\|_2 < 1$, where $D$ is some square matrix.

Theorems presented here are often better than those in the next section when applied to nonnegative definite matrices. We will make this more concrete in the coming section.


5.1 Eigenvalue Variations for $A$ and $\widetilde A = D^*AD$

Theorem 5.1 Let $A$ and $\widetilde A = D^*AD$ be two $n \times n$ Hermitian matrices, where $D$ is nonsingular. Denote their eigenvalues as in (3.1) and arrange them descendingly as described in (3.4). Assume that $A$ is nonnegative definite. Then

\[ \max_{1\le i\le n} \mathrm{RelDist}(\lambda_i, \tilde\lambda_i) \le \|D^* - D^{-1}\|_2, \tag{5.1} \]

\[ \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\lambda_i, \tilde\lambda_i)]^2} \le \|D^* - D^{-1}\|_F. \tag{5.2} \]

It is trivial to relate the right-hand sides of the inequalities (5.1) and (5.2) to the singular values of $D$. In fact, let the SVD of $D$ be

\[ D = U_d\Sigma_dV_d^*. \tag{5.3} \]

One has for any unitarily invariant norm $|||\cdot|||$

\[ |||D^* - D^{-1}||| = |||V_d(\Sigma_d - \Sigma_d^{-1})U_d^*||| = |||\Sigma_d - \Sigma_d^{-1}|||. \]
Another point we would like to make is that $A$ and $D^*AD$ have the same rank, or in other words, $A$ and $D^*AD$ have the same number of zero eigenvalues. In order for the inequalities (5.1) and (5.2) to be true, zero eigenvalues, if any, must always be paired with zero ones.
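The inequalities (5.1) and (5.2) are easy to test numerically. The sketch below uses a random real positive definite $A$ and a $D$ close to the identity (both hypothetical test data) and takes $\mathrm{RelDist}(\alpha, \tilde\alpha) = |\alpha - \tilde\alpha| / \sqrt{|\alpha\tilde\alpha|}$, the form used in the derivation of (7.4). It also confirms that $\|D^* - D^{-1}\|_2$ can be read off the singular values of $D$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

M = rng.standard_normal((n, n))
A = M @ M.T + 0.1 * np.eye(n)                       # symmetric positive definite
D = np.eye(n) + 0.05 * rng.standard_normal((n, n))  # nonsingular, close to I
A_tilde = D.T @ A @ D                               # D^* A D for real D

# Descending order, as in (3.4).
lam = np.linalg.eigvalsh(A)[::-1]
lam_tilde = np.linalg.eigvalsh(A_tilde)[::-1]

def reldist(a, b):
    # RelDist(a, b) = |a - b| / sqrt(|a b|).
    return np.abs(a - b) / np.sqrt(np.abs(a * b))

rd = reldist(lam, lam_tilde)
gap = D.T - np.linalg.inv(D)                        # D^* - D^{-1}

bound_2 = np.linalg.norm(gap, 2)
bound_F = np.linalg.norm(gap, "fro")

# Same quantity from the singular values of D: ||Sigma_d - Sigma_d^{-1}||_2.
sv = np.linalg.svd(D, compute_uv=False)
bound_2_from_sv = np.max(np.abs(sv - 1.0 / sv))

print(f"(5.1): {rd.max():.3e} <= {bound_2:.3e}")
print(f"(5.2): {np.sqrt(np.sum(rd**2)):.3e} <= {bound_F:.3e}")
```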


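5.2 Singular Value Variations for $B$ and $\widetilde B = D_1^*BD_2$

Theorem 5.2 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m \times n$ matrices, where $D_1$ and $D_2$ are square and nonsingular. Denote their singular values as in (3.2) and arrange them as in (3.3). Then

\[ \max_{1\le i\le n} \mathrm{RelDist}(\sigma_i, \tilde\sigma_i) \le \frac12\big( \|D_1^* - D_1^{-1}\|_2 + \|D_2^* - D_2^{-1}\|_2 \big), \tag{5.4} \]

\[ \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \tilde\sigma_i)]^2} \le \frac12\big( \|D_1^* - D_1^{-1}\|_F + \|D_2^* - D_2^{-1}\|_F \big). \tag{5.5} \]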


Now let us briefly mention a possible application of Theorem 5.2. It has something to do with deflation in computing the singular value systems of a bidiagonal matrix. For more details, the reader is referred to [6, 8, 10, 26]. We formulate the application as a corollary.

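Corollary 5.1 Assume that, in Theorem 5.2, one of $D_1$ and $D_2$ is an identity matrix and the other takes the form

\[ D = \begin{pmatrix} I & X \\ & I \end{pmatrix}, \]

where $X$ is a matrix of suitable dimensions. With the notation of Theorem 5.2, we have

\[ \max_{1\le i\le n} \mathrm{RelDist}(\sigma_i, \tilde\sigma_i) \le \frac12\|X\|_2, \tag{5.6} \]

\[ \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \tilde\sigma_i)]^2} \le \frac{1}{\sqrt2}\|X\|_F. \tag{5.7} \]

Proof: Notice that

\[ D^* - D^{-1} = \begin{pmatrix} I & \\ X^* & I \end{pmatrix} - \begin{pmatrix} I & -X \\ & I \end{pmatrix} = \begin{pmatrix} & X \\ X^* & \end{pmatrix}, \]

and thus $\|D^* - D^{-1}\|_2 = \|X\|_2$ and $\|D^* - D^{-1}\|_F = \sqrt2\,\|X\|_F$.

It was proved by Eisenstat and Ipsen [10] that

\[ |\tilde\sigma_i - \sigma_i| \le \|X\|_2\,\sigma_i, \quad\text{or equivalently}\quad \left| \frac{\tilde\sigma_i}{\sigma_i} - 1 \right| \le \|X\|_2. \tag{5.8} \]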
So as long as $\tilde\sigma_i$ and $\sigma_i$ are of similar magnitude, which is guaranteed if $\|X\|_2$ is small, our inequality (5.6) is sharper by a factor of 1/2. As a matter of fact, it follows from (5.6) and Proposition 2.9 that if $\|X\|_2 < 4$ then

\[ \left| \frac{\tilde\sigma_i}{\sigma_i} - 1 \right| \le \left( \frac{\|X\|_2}{4} + \sqrt{1 + \frac{\|X\|_2^2}{16}} \right) \frac{\|X\|_2}{2} = \frac{\|X\|_2}{2} + O(\|X\|_2^2). \]


Our  inequality  (5.7)  is  the  first  of  its  kind. 
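To see what (5.6) and (5.7) say in a deflation-like setting, the sketch below perturbs a small upper bidiagonal matrix by $D = \begin{pmatrix} I & X \\ & I \end{pmatrix}$ on the right (so $D_1 = I$, $D_2 = D$). The entries of `B` and `X` are made-up test data. The absolute perturbation $\|B - \widetilde B\|_2$ is far larger than the smallest singular values, so the classical bound of Theorem 4.7 says nothing about them, while the relative bounds still apply.

```python
import numpy as np

# Upper bidiagonal B with singular values of widely varying size (hypothetical data).
B = np.diag([1.0, 1e-2, 1e-4, 1e-6]) + np.diag([0.5, 5e-3, 5e-5], k=1)

# D = [[I, X], [0, I]] with a small off-diagonal block X.
X = np.array([[1e-3, 0.0], [2e-3, -1e-3]])
D = np.eye(4)
D[:2, 2:] = X
B_tilde = B @ D                                   # D_1 = I, D_2 = D

# svd returns singular values in descending order, as in (3.3).
sigma = np.linalg.svd(B, compute_uv=False)
sigma_tilde = np.linalg.svd(B_tilde, compute_uv=False)

rd = np.abs(sigma - sigma_tilde) / np.sqrt(sigma * sigma_tilde)
bound_56 = 0.5 * np.linalg.norm(X, 2)
bound_57 = np.linalg.norm(X, "fro") / np.sqrt(2)

# Classical absolute bound of Theorem 4.7, expressed relative to each sigma_i.
classical = np.linalg.norm(B - B_tilde, 2) / sigma

print("RelDist per singular value:", rd)
print(f"(5.6): {rd.max():.3e} <= {bound_56:.3e}")
print(f"(5.7): {np.sqrt(np.sum(rd**2)):.3e} <= {bound_57:.3e}")
print("classical bound / sigma_i:", classical)
```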


5.3  Graded  Matrices 

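Theorem 5.3 Let $G = BD$ and $\widetilde G = \widetilde BD$ be two $n \times n$ matrices, where $B$ and $\widetilde B$ are nonsingular, and let $\Delta B = \widetilde B - B$. Denote

\[ \sigma(G) = \{\sigma_1, \dots, \sigma_n\} \quad\text{and}\quad \sigma(\widetilde G) = \{\tilde\sigma_1, \dots, \tilde\sigma_n\}, \]

and arrange them descendingly as in (3.3). If $\|\Delta B\|_2\|B^{-1}\|_2 < 1$, then

\begin{align*}
\max_{1\le i\le n} \mathrm{RelDist}(\sigma_i, \tilde\sigma_i)
&\le \frac12 \big\| (I + (\Delta B)B^{-1})^* - (I + (\Delta B)B^{-1})^{-1} \big\|_2 \\
&\le \left( \frac{\|(\Delta B)B^{-1} + B^{-*}(\Delta B)^*\|_2}{\|(\Delta B)B^{-1}\|_2} + \frac{\|(\Delta B)B^{-1}\|_2}{1 - \|(\Delta B)B^{-1}\|_2} \right) \frac{\|(\Delta B)B^{-1}\|_2}{2} \\
&\le \left( 1 + \frac{1}{1 - \|B^{-1}\|_2\|\Delta B\|_2} \right) \frac{\|B^{-1}\|_2\|\Delta B\|_2}{2}, \tag{5.9}
\end{align*}

\begin{align*}
\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \tilde\sigma_i)]^2}
&\le \frac12 \big\| (I + (\Delta B)B^{-1})^* - (I + (\Delta B)B^{-1})^{-1} \big\|_F \\
&\le \left( \frac{\|(\Delta B)B^{-1} + B^{-*}(\Delta B)^*\|_F}{\|(\Delta B)B^{-1}\|_F} + \frac{\|(\Delta B)B^{-1}\|_2}{1 - \|(\Delta B)B^{-1}\|_2} \right) \frac{\|(\Delta B)B^{-1}\|_F}{2} \\
&\le \left( 1 + \frac{1}{1 - \|B^{-1}\|_2\|\Delta B\|_2} \right) \frac{\|B^{-1}\|_2\|\Delta B\|_F}{2}. \tag{5.10}
\end{align*}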


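Remark. It is interesting to notice that if $(\Delta B)B^{-1}$ is very skew, then $\mathrm{RelDist}(\sigma_i, \tilde\sigma_i) = o(\|(\Delta B)B^{-1}\|_2)$. In particular, if $\|(\Delta B)B^{-1} + B^{-*}(\Delta B)^*\|_2 = O(\|(\Delta B)B^{-1}\|_2^2)$, then $\mathrm{RelDist}(\sigma_i, \tilde\sigma_i) = O(\|(\Delta B)B^{-1}\|_2^2)$ also.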
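The chain (5.9) can be evaluated on a small graded example. In the sketch below, `B`, `dB` and the grading `D` are hypothetical test data; the grading is kept moderate so that a standard dense SVD still resolves the smallest singular value to many digits. The quantity `bound_712` is the bound (7.12) derived from Mathias's theorem in §7, included for comparison with the last line of (5.9).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

B = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # well-conditioned factor
dB = 1e-3 * rng.standard_normal((n, n))             # perturbation of B only
D = np.diag([1.0, 1e-2, 1e-4, 1e-6])                # graded scaling

G = B @ D
G_tilde = (B + dB) @ D

sigma = np.linalg.svd(G, compute_uv=False)
sigma_tilde = np.linalg.svd(G_tilde, compute_uv=False)
rd = np.abs(sigma - sigma_tilde) / np.sqrt(sigma * sigma_tilde)

E = dB @ np.linalg.inv(B)                            # E_B = (Delta B) B^{-1}
I = np.eye(n)
core = (I + E).T - np.linalg.inv(I + E)              # (I + E_B)^* - (I + E_B)^{-1}
x = np.linalg.norm(np.linalg.inv(B), 2) * np.linalg.norm(dB, 2)

bound_first = 0.5 * np.linalg.norm(core, 2)          # first line of (5.9)
bound_last = (1 + 1 / (1 - x)) * x / 2               # last line of (5.9)
bound_712 = x / np.sqrt(1 - x)                       # (7.12)

print(f"max RelDist = {rd.max():.3e}")
print(f"first bound = {bound_first:.3e}, (7.12) = {bound_712:.3e}, last bound = {bound_last:.3e}")
```

The grading matrix `D` does not appear in any of the bounds, which is the point of the theorem: only the perturbation relative to `B` matters.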

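Theorem 5.4 Let $H = D^*AD$ and $\widetilde H = D^*\widetilde AD$ be two $n \times n$ nonnegative definite Hermitian matrices whose eigenvalues are

\[ \lambda(H) = \{\lambda_1, \dots, \lambda_n\} \quad\text{and}\quad \lambda(\widetilde H) = \{\tilde\lambda_1, \dots, \tilde\lambda_n\} \tag{5.11} \]

and in descending order as in (3.4), and let $\Delta A = \widetilde A - A$. If

\[ \|A^{-1}\|_2\|\Delta A\|_2 < 1, \tag{5.12} \]

then

\begin{align*}
\max_{1\le i\le n} \mathrm{RelDist}(\lambda_i, \tilde\lambda_i)
&\le \big\| (I + A^{-1/2}(\Delta A)A^{-1/2})^{1/2} - (I + A^{-1/2}(\Delta A)A^{-1/2})^{-1/2} \big\|_2 \\
&\le \frac{\|A^{-1}\|_2\|\Delta A\|_2}{\sqrt{1 - \|A^{-1}\|_2\|\Delta A\|_2}}, \tag{5.13}
\end{align*}

\begin{align*}
\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\lambda_i, \tilde\lambda_i)]^2}
&\le \big\| (I + A^{-1/2}(\Delta A)A^{-1/2})^{1/2} - (I + A^{-1/2}(\Delta A)A^{-1/2})^{-1/2} \big\|_F \\
&\le \frac{\|A^{-1}\|_2\|\Delta A\|_F}{\sqrt{1 - \|A^{-1}\|_2\|\Delta A\|_2}}. \tag{5.14}
\end{align*}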
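A numerical check of (5.13) and (5.14) only needs the eigenvalues $\mu_k$ of $E_A = A^{-1/2}(\Delta A)A^{-1/2}$, because the matrix $(I + E_A)^{1/2} - (I + E_A)^{-1/2}$ is Hermitian with eigenvalues $\sqrt{1+\mu_k} - 1/\sqrt{1+\mu_k}$. The sketch below uses hypothetical test matrices and a deliberately mild grading: a standard symmetric eigensolver does not deliver high relative accuracy for strongly graded $H$, so stronger grading would blur the comparison.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4

M = rng.standard_normal((n, n))
A = np.eye(n) + M @ M.T / n                  # symmetric positive definite, lambda_min >= 1
N = rng.standard_normal((n, n))
dA = 1e-3 * (N + N.T)                        # symmetric perturbation of A
D = np.diag([1.0, 1e-1, 1e-2, 1e-3])         # mild grading

H = D @ A @ D
H_tilde = D @ (A + dA) @ D

lam = np.linalg.eigvalsh(H)[::-1]            # descending order, as in (3.4)
lam_tilde = np.linalg.eigvalsh(H_tilde)[::-1]
rd = np.abs(lam - lam_tilde) / np.sqrt(lam * lam_tilde)

# E_A = A^{-1/2} (Delta A) A^{-1/2} via the eigendecomposition of A.
w, V = np.linalg.eigh(A)
A_inv_half = V @ np.diag(w ** -0.5) @ V.T
E_A = A_inv_half @ dA @ A_inv_half
mu = np.linalg.eigvalsh((E_A + E_A.T) / 2)

# Eigenvalues of (I + E_A)^{1/2} - (I + E_A)^{-1/2}.
f = np.sqrt(1 + mu) - 1 / np.sqrt(1 + mu)
bound_2 = np.max(np.abs(f))
bound_F = np.sqrt(np.sum(f ** 2))

x = np.linalg.norm(np.linalg.inv(A), 2) * np.linalg.norm(dA, 2)
outer_2 = x / np.sqrt(1 - x)

print(f"(5.13): {rd.max():.3e} <= {bound_2:.3e} <= {outer_2:.3e}")
print(f"(5.14): {np.sqrt(np.sum(rd**2)):.3e} <= {bound_F:.3e}")
```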
6 Statement of Theorems with $\mathrm{RelDist}_p$

The rest of the cases listed in §1.1, as well as singular value problems, will be treated here. To be specific, we will consider

• Eigenvalue problems:

1. $A$ and $\widetilde A = D^*AD$ for the Hermitian case, where $D$ is nonsingular and close to $I$ or more generally to a unitary matrix;

2. $A$ and $\widetilde A = D_1^*AD_2$ for the general diagonalizable case, where $D_1$ and $D_2$ are nonsingular and close to $I$ or more generally to some unitary matrix;

• Singular value problems:

1. $B$ and $\widetilde B = D_1^*BD_2$, where $D_1$ and $D_2$ are nonsingular and close to $I$ or more generally to two unitary matrices.

We treat singular value problems again for comparison purposes. As we will see soon, we will prove more nice inequalities for singular value variations, but these inequalities may be less sharp than those in §5 for large perturbations. Brief comparisons between theorems in this section and those in the previous section will be given.

6.1  Eigenvalue  Variations 

The  following  theorem  is  a  generalization  of  Theorems  4.1  and  4.2. 

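Theorem 6.1 Assume that the $n \times n$ matrix $A$ is perturbed to $\widetilde A = D_1^*AD_2$ and both $D_1$ and $D_2$ are nonsingular. Assume also that both $A$ and $\widetilde A$ are diagonalizable and admit the decompositions described in (4.1) and (4.2). Then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\begin{align*}
&\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \tag{6.1} \\
&\quad\le \min\Big\{ \|\widetilde X^{-1}\|_2\|X\|_2 \sqrt{\|X^{-1}(I - D_2)\widetilde X\|_F^2 + \|X^{-1}(D_1^{-*} - I)\widetilde X\|_F^2}, \\
&\qquad\qquad \|X^{-1}\|_2\|\widetilde X\|_2 \sqrt{\|\widetilde X^{-1}(I - D_1^*)X\|_F^2 + \|\widetilde X^{-1}(D_2^{-1} - I)X\|_F^2} \Big\} \\
&\quad\le \kappa(X)\kappa(\widetilde X) \min\Big\{ \sqrt{\|I - D_1\|_F^2 + \|I - D_2^{-1}\|_F^2},\ \sqrt{\|I - D_1^{-1}\|_F^2 + \|I - D_2\|_F^2} \Big\}.
\end{align*}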
For any given $U \in \mathbb{U}_n$, $U\widetilde AU^* = (D_1U^*)^*AD_2U^*$ has the same eigenvalues as $\widetilde A$ does, and moreover from (4.1)

\[ U\widetilde AU^* = (U\widetilde X)\widetilde\Lambda(U\widetilde X)^{-1}. \]

So applying Theorem 6.1 to the matrices $A$ and $U\widetilde AU^*$ leads to the following theorem, which we will refer to as Theorem 6.1s, where "s" indicates that it is stronger.

Theorem 6.1s Let all conditions of Theorem 6.1 hold. Then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\begin{align*}
&\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \tag{6.2} \\
&\quad\le \kappa(X)\kappa(\widetilde X) \min_{U \in \mathbb{U}_n} \min\Big\{ \sqrt{\|U - D_1\|_F^2 + \|U^* - D_2^{-1}\|_F^2},\ \sqrt{\|U^* - D_1^{-1}\|_F^2 + \|U - D_2\|_F^2} \Big\}.
\end{align*}


Suppose now $A \in \mathbb{C}^{n\times n}$ is a normal matrix, i.e., $A^*A = AA^*$. Perturb $A$ to $\widetilde A = D_1^*AD_2$. The question is: when is $\widetilde A$ also normal? This is a rather interesting question, and an instant answer is that $\widetilde A$ is normal provided

\[ D_2^*A^*D_1D_1^*AD_2 = D_1^*AD_2D_2^*A^*D_1. \]

However, this condition is, perhaps, too general to be useful. I do not know how to approach this problem yet, and therefore this question will not be addressed further in what follows. On the other hand, if we happen to know that $\widetilde A$ is also normal, the following theorem, as a corollary of Theorem 6.1, indicates that the eigenvalues of $A$ and $\widetilde A$ agree to high relative accuracy.

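Theorem 6.2 Let $A$ and $\widetilde A = D_1^*AD_2$ be two $n \times n$ normal matrices, where $D_1$ and $D_2$ are nonsingular. Denote their eigenvalues as in (3.1). Then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\begin{align*}
&\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \tag{6.3} \\
&\quad\le \min_{U \in \mathbb{U}_n} \min\Big\{ \sqrt{\|U - D_1\|_F^2 + \|U^* - D_2^{-1}\|_F^2},\ \sqrt{\|U^* - D_1^{-1}\|_F^2 + \|U - D_2\|_F^2} \Big\}.
\end{align*}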

We happen to know how to solve the minimization problem: find a $U_0 \in \mathbb{U}_n$ such that for any unitarily invariant norm $|||\cdot|||$

\[ \min_{U \in \mathbb{U}_n} |||U - D||| = |||U_0 - D||| \quad\text{and}\quad \min_{U \in \mathbb{U}_n} |||U^* - D^{-1}||| = |||U_0^* - D^{-1}||| \tag{6.4} \]

in terms of the singular value decomposition (SVD) of $D$. As a matter of fact, let the SVD of $D$ be given in (5.3). It follows from Theorem 4.7 that

\[ |||U - D||| \ge |||I - \Sigma_d||| \quad\text{and}\quad |||U^* - D^{-1}||| \ge |||I - \Sigma_d^{-1}|||. \tag{6.5} \]

Fortunately, there is one $U_0 \stackrel{\rm def}{=} U_dV_d^*$ which realizes the two equality signs.
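The minimizer $U_0 = U_dV_d^*$ is the unitary polar factor of $D$, and the two equalities can be checked numerically. In the sketch below (Frobenius norm only, with a random complex $D$ as hypothetical test data) the distances at $U_0$ are compared with $\|I - \Sigma_d\|_F$ and $\|I - \Sigma_d^{-1}\|_F$, and then against a sample of random unitary matrices generated by the illustrative helper `random_unitary`. The random sample can only fail to beat $U_0$; it does not by itself prove optimality.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
D = np.eye(n) + 0.1 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

U_d, s, Vh_d = np.linalg.svd(D)          # D = U_d diag(s) V_d^*, with Vh_d = V_d^*
U0 = U_d @ Vh_d                          # candidate minimizer U_0 = U_d V_d^*
D_inv = np.linalg.inv(D)

def random_unitary(size, rng):
    # Illustrative helper: Q factor of a random complex matrix is unitary.
    Z = rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))
    Q, _ = np.linalg.qr(Z)
    return Q

at_U0 = (np.linalg.norm(U0 - D, "fro"), np.linalg.norm(U0.conj().T - D_inv, "fro"))
target = (np.linalg.norm(1 - s), np.linalg.norm(1 - 1 / s))   # ||I - Sigma_d||_F, ||I - Sigma_d^{-1}||_F

trials = [random_unitary(n, rng) for _ in range(200)]
dist_D = [np.linalg.norm(U - D, "fro") for U in trials]
dist_Dinv = [np.linalg.norm(U.conj().T - D_inv, "fro") for U in trials]

print("at U0:", at_U0, "target:", target)
print("best random U:", min(dist_D), min(dist_Dinv))
```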

Theorem 6.2, when applied to Hermitian matrices, leads to

Theorem 6.3 Let $A$ and $\widetilde A = D^*AD$ be two $n \times n$ Hermitian matrices, where $D$ is nonsingular. Denote their eigenvalues as in (3.1). Then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\[ \sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \le \sqrt{\|I - \Sigma_d\|_F^2 + \|I - \Sigma_d^{-1}\|_F^2}. \tag{6.6} \]


It is worth mentioning that the permutation $\tau$ in Theorem 6.3 may not be the identity, assuming eigenvalues are ordered as in (3.4). However, one can always choose a $\tau$ such that zeros are matched to zeros, negative eigenvalues to negative ones and positive ones to positive ones (Cf. Propositions 2.6 and 2.7). A brief comparison of this theorem and the inequality (5.2) in Theorem 5.1 leads to the following conclusions:

1. Theorem 6.3 covers both the definite case and the indefinite case, while the inequality (5.2) in Theorem 5.1 covers the definite case only;

2. When applied to the definite case, (5.2) is sharper than (6.6). As a matter of fact, (6.6) is a corollary of (5.2). It follows from (5.2) and Proposition 2.12 that if $A$ is nonnegative definite

\begin{align*}
\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_i)]^2}
&\le \frac{1}{\sqrt2}\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\lambda_i, \tilde\lambda_i)]^2} \\
&\le \frac{1}{\sqrt2}\|\Sigma_d - \Sigma_d^{-1}\|_F \\
&\le \sqrt{\|I - \Sigma_d\|_F^2 + \|I - \Sigma_d^{-1}\|_F^2}
\end{align*}

by Lemma 6.1 below.

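Lemma 6.1

\[ \frac{1}{\sqrt2}\|\Sigma_d - \Sigma_d^{-1}\|_F \le \sqrt{\|I - \Sigma_d\|_F^2 + \|I - \Sigma_d^{-1}\|_F^2}, \]

and the equality holds if and only if $\Sigma_d = I$, i.e., $D$ is unitary.

Proof: Notice that for $\xi > 0$

\[ \left| \xi - \frac1\xi \right| \le |\xi - 1| + \left| 1 - \frac1\xi \right| \le \sqrt2 \sqrt{|\xi - 1|^2 + \left| 1 - \frac1\xi \right|^2}, \]

and the equality sign holds if and only if $\xi = 1$.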


The  theorem  below  is  a  generalization  of  Theorems  4.3  and  4.4  for  the  spectral 
norm  and  that  of  Theorem  4.5. 


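Theorem 6.4 To the hypotheses of Theorem 6.1 add this: all $\lambda_i$'s and $\tilde\lambda_j$'s are nonnegative and are arranged descendingly as described in (3.4). Then we have

\[ \max_{1\le i\le n} \mathrm{RelDist}_p(\lambda_i, \tilde\lambda_i) \le \kappa_r(X)\kappa_r(\widetilde X) \min\Big\{ \sqrt[q]{\|I - D_1^*\|_r^q + \|I - D_2^{-1}\|_r^q},\ \sqrt[q]{\|I - D_1^{-*}\|_r^q + \|I - D_2\|_r^q} \Big\}, \tag{6.7} \]

where $1 \le r \le \infty$.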

Similarly  to  Theorem  6.1,  there  is  a  stronger  version  of  this  theorem  as  follows. 
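Theorem 6.4s Let all conditions of Theorem 6.4 hold. Then

\begin{align*}
\max_{1\le i\le n} \mathrm{RelDist}_p(\lambda_i, \tilde\lambda_i) \le{}& \kappa_r(X)\kappa_r(\widetilde X) \times {} \tag{6.8} \\
& \min_{U \in \mathbb{U}_n} \min\Big\{ \sqrt[q]{\|U - D_1\|_r^q + \|U^* - D_2^{-1}\|_r^q},\ \sqrt[q]{\|U^* - D_1^{-1}\|_r^q + \|U - D_2\|_r^q} \Big\}.
\end{align*}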


As a consequence of this theorem and the solution (6.5) to the optimization problem (6.4), we deduce that

Theorem 6.5 Under the conditions of Theorem 6.3, if $A$ is nonnegative definite and the eigenvalues of $A$ and $\widetilde A$ are in descending order as in (3.4), then

\[ \max_{1\le i\le n} \mathrm{RelDist}_p(\lambda_i, \tilde\lambda_i) \le \sqrt[q]{\|I - \Sigma_d\|_2^q + \|I - \Sigma_d^{-1}\|_2^q}, \tag{6.9} \]

where $\Sigma_d$ is defined in (5.3).

However, there is not much interest in this theorem for two reasons: one is that (6.9) works for nonnegative definite matrices only, just like the inequality (5.1) of Theorem 5.1; and the other is that (6.9) is less sharp than (5.1). To see this, we notice that (5.1) and Proposition 2.12 imply that

\[ \max_{1\le i\le n} \mathrm{RelDist}_p(\lambda_i, \tilde\lambda_i) \le 2^{-1/p} \max_{1\le i\le n} \mathrm{RelDist}(\lambda_i, \tilde\lambda_i) \le 2^{-1/p}\|\Sigma_d - \Sigma_d^{-1}\|_2. \]

So with Lemma 6.2 below, one can deduce (6.9) from (5.1). But still (6.9) looks nice and clean.
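Lemma 6.2

\[ \|\Sigma_d - \Sigma_d^{-1}\|_2 \le 2^{1/p} \sqrt[q]{\|I - \Sigma_d\|_2^q + \|I - \Sigma_d^{-1}\|_2^q}, \]

and the equality holds if and only if $\Sigma_d = I$, i.e., $D$ is unitary.

Proof: Let $\xi \in \sigma(D)$ so that $\|\Sigma_d - \Sigma_d^{-1}\|_2 = \left| \xi - \frac1\xi \right|$. Then

\begin{align*}
\left| \xi - \frac1\xi \right|
&\le |\xi - 1| + \left| 1 - \frac1\xi \right| \\
&\le 2^{1/p} \sqrt[q]{|\xi - 1|^q + \left| 1 - \frac1\xi \right|^q} \\
&\le 2^{1/p} \sqrt[q]{\|I - \Sigma_d\|_2^q + \|I - \Sigma_d^{-1}\|_2^q}, \tag{6.10}
\end{align*}

as required.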
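The middle step of (6.10) is a scalar Hölder inequality, $a + b \le 2^{1/p}(a^q + b^q)^{1/q}$, which can be spot-checked for a few singular values $\xi$. The sketch below assumes that $q$ denotes the conjugate exponent of $p$ (that is, $1/p + 1/q = 1$) and restricts to $p > 1$ so that $q$ is finite; the function name `holder_gap` is introduced here for illustration. With $p = q = 2$ it reproduces the scalar inequality in the proof of Lemma 6.1.

```python
def holder_gap(xi, p):
    """Return (lhs, rhs) of the scalar inequality behind Lemma 6.2 for xi > 0 and p > 1."""
    q = p / (p - 1.0)                       # conjugate exponent, 1/p + 1/q = 1 (assumed)
    a, b = abs(xi - 1.0), abs(1.0 - 1.0 / xi)
    lhs = abs(xi - 1.0 / xi)
    rhs = 2.0 ** (1.0 / p) * (a ** q + b ** q) ** (1.0 / q)
    return lhs, rhs

samples = [(xi, p) for xi in (0.25, 0.5, 0.9, 1.0, 1.1, 2.0, 4.0)
           for p in (1.5, 2.0, 3.0, 10.0)]
results = [holder_gap(xi, p) for xi, p in samples]

for (xi, p), (lhs, rhs) in zip(samples, results):
    print(f"xi = {xi:4.2f}, p = {p:4.1f}: {lhs:.4f} <= {rhs:.4f}")
```

Equality is attained only at $\xi = 1$, where both sides vanish.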

So  far  we  have  considered  the  case  when  both  A  and  A  are  diagonalizable.  In 
what  follows,  we  weaken  this  assumption  by  requiring  only  A  to  be  diagonaliz¬ 
able  and  derive  relative  eigenvalue  perturbation  bounds  of  Bauer-Fike  Type  [2]. 

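Theorem 6.6 Assume that $A \in \mathbb{C}^{n\times n}$ is diagonalizable and admits the following decomposition

\[ A = X\Lambda X^{-1} \quad\text{where}\quad \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n). \tag{6.11} \]

Assume² also either $\widetilde A = DA$ or $\widetilde A = AD$. Then for any $\tilde\lambda \in \lambda(\widetilde A)$

\[ \min_{\lambda \in \lambda(A)} \frac{|\lambda - \tilde\lambda|}{|\lambda|} \le \|X^{-1}(D - I)X\|_p \le \kappa_p(X)\|I - D\|_p. \tag{6.12} \]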

6.2  Singular  Value  Variations 

As  to  singular  value  variations,  we  will  prove 

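Theorem 6.7 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m \times n$ matrices, where $D_1$ and $D_2$ are nonsingular. Denote their singular values as in (3.2). Then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\begin{align*}
&\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\sigma_i, \tilde\sigma_{\tau(i)})]^2} \\
&\quad\le \frac{1}{\sqrt2}\sqrt{\|I - D_1\|_F^2 + \|I - D_1^{-1}\|_F^2 + \|I - D_2\|_F^2 + \|I - D_2^{-1}\|_F^2}. \tag{6.13}
\end{align*}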

2Unlike  in  our  previous  theorems,  here  we  do  not  have  to  assume  that  D  is  nonsingular. 
Of  course,  if  D  is  far  away  from  I,  the  bound  (6.12)  does  not  tell  us  much;  if  D  is  close  enough 
to  I,  it  has  to  be  nonsingular. 
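For any given $U \in \mathbb{U}_m$ and $V \in \mathbb{U}_n$, $U\widetilde BV^* = (D_1U^*)^*BD_2V^*$ has the same singular values as $\widetilde B$ does. Let the SVDs of $D_1$ and $D_2$ be

\[ D_1 = U_{d1}\Sigma_{d1}V_{d1}^* \quad\text{and}\quad D_2 = U_{d2}\Sigma_{d2}V_{d2}^*. \tag{6.14} \]

Applying Theorem 6.7 to the matrices $B$ and $U\widetilde BV^*$, together with the solution (6.5) to the optimization problem (6.4), leads to the following stronger version of the theorem.

Theorem 6.7s Let all conditions of Theorem 6.7 hold. Then there is a permutation $\tau$ of $\{1, 2, \dots, n\}$ such that

\begin{align*}
&\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\sigma_i, \tilde\sigma_{\tau(i)})]^2} \\
&\quad\le \frac{1}{\sqrt2} \min_{U \in \mathbb{U}_m,\, V \in \mathbb{U}_n} \sqrt{\|U - D_1\|_F^2 + \|U^* - D_1^{-1}\|_F^2 + \|V - D_2\|_F^2 + \|V^* - D_2^{-1}\|_F^2} \\
&\quad= \frac{1}{\sqrt2}\sqrt{\|I - \Sigma_{d1}\|_F^2 + \|I - \Sigma_{d1}^{-1}\|_F^2 + \|I - \Sigma_{d2}\|_F^2 + \|I - \Sigma_{d2}^{-1}\|_F^2}, \tag{6.15}
\end{align*}

where $\Sigma_{d1}$ and $\Sigma_{d2}$ are defined in (6.14).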

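Theorems 6.7 and 6.7s are of less interest since they provide less sharp bounds than Theorem 5.2 does. We keep them for comparison purposes, though they still look pretty. Now we show how to derive (6.15) from (5.5) of Theorem 5.2. It follows from (5.5) and Proposition 2.12 that

\begin{align*}
\sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\sigma_i, \tilde\sigma_i)]^2}
&\le \frac{1}{\sqrt2}\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \tilde\sigma_i)]^2} \\
&\le \frac{1}{2\sqrt2}\big( \|\Sigma_{d1} - \Sigma_{d1}^{-1}\|_F + \|\Sigma_{d2} - \Sigma_{d2}^{-1}\|_F \big) \\
&\le \frac12\Big( \sqrt{\|I - \Sigma_{d1}\|_F^2 + \|I - \Sigma_{d1}^{-1}\|_F^2} + \sqrt{\|I - \Sigma_{d2}\|_F^2 + \|I - \Sigma_{d2}^{-1}\|_F^2} \Big) \quad\text{(by Lemma 6.1)} \\
&\le \frac{1}{\sqrt2}\sqrt{\|I - \Sigma_{d1}\|_F^2 + \|I - \Sigma_{d1}^{-1}\|_F^2 + \|I - \Sigma_{d2}\|_F^2 + \|I - \Sigma_{d2}^{-1}\|_F^2},
\end{align*}

which shows (6.15). The proof of Theorem 6.7 in §10 is, however, of a different spirit.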

Theorem 6.8 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m \times n$ matrices, where $D_1$ and $D_2$ are nonsingular. Denote their singular values as in (3.2), and arrange the singular values of $B$ and $\widetilde B$ in descending order respectively as in (3.3). Then we have the following:

\[ \max_{1\le i\le n} \mathrm{RelDist}_p(\sigma_i, \tilde\sigma_i) \le \min\Big\{ \sqrt[q]{\|I - D_1^{-1}\|_2^q + \|I - D_2\|_2^q},\ \sqrt[q]{\|I - D_1\|_2^q + \|I - D_2^{-1}\|_2^q} \Big\}. \tag{6.16} \]


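Similarly, applying Theorem 6.8 to the matrices $B$ and $U\widetilde BV^*$, we will have

Theorem 6.8s Let all conditions of Theorem 6.8 hold. Then

\begin{align*}
\max_{1\le i\le n} \mathrm{RelDist}_p(\sigma_i, \tilde\sigma_i)
&\le \min_{U \in \mathbb{U}_m,\, V \in \mathbb{U}_n} \min\Big\{ \sqrt[q]{\|U^* - D_1^{-1}\|_2^q + \|V - D_2\|_2^q},\ \sqrt[q]{\|U - D_1\|_2^q + \|V^* - D_2^{-1}\|_2^q} \Big\} \\
&= \min\Big\{ \sqrt[q]{\|I - \Sigma_{d1}^{-1}\|_2^q + \|I - \Sigma_{d2}\|_2^q},\ \sqrt[q]{\|I - \Sigma_{d1}\|_2^q + \|I - \Sigma_{d2}^{-1}\|_2^q} \Big\}, \tag{6.17}
\end{align*}

where $\Sigma_{d1}$ and $\Sigma_{d2}$ are defined in (6.14).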

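We cannot say for sure that (5.4) of Theorem 5.2 is always sharper than the inequality (6.17), but much evidence indicates so. Let us weaken (6.17) a little into

\[ \max_{1\le i\le n} \mathrm{RelDist}_p(\sigma_i, \tilde\sigma_i) \le \frac12\Big( \sqrt[q]{\|I - \Sigma_{d1}^{-1}\|_2^q + \|I - \Sigma_{d2}\|_2^q} + \sqrt[q]{\|I - \Sigma_{d1}\|_2^q + \|I - \Sigma_{d2}^{-1}\|_2^q} \Big). \tag{6.18} \]

(6.18) degrades (6.17) marginally in interesting cases. In what follows we will show that (6.18) is a consequence of Theorem 5.2. To this end, let $\xi \in \sigma(D_1)$ and $\zeta \in \sigma(D_2)$ so that

\[ \|D_1^* - D_1^{-1}\|_2 = \left| \xi - \frac1\xi \right| \quad\text{and}\quad \|D_2^* - D_2^{-1}\|_2 = \left| \zeta - \frac1\zeta \right|. \]

We notice that

\begin{align*}
\mathrm{RelDist}_p(\sigma_i, \tilde\sigma_i)
&\le 2^{-1/p}\,\mathrm{RelDist}(\sigma_i, \tilde\sigma_i) \quad\text{(by Proposition 2.12)} \\
&\le \frac{1}{2^{1+1/p}}\big( \|\Sigma_{d1} - \Sigma_{d1}^{-1}\|_2 + \|\Sigma_{d2} - \Sigma_{d2}^{-1}\|_2 \big) \quad\text{(by Theorem 5.2)} \\
&= \frac{1}{2^{1+1/p}}\left( \left| \xi - \frac1\xi \right| + \left| \zeta - \frac1\zeta \right| \right) \\
&\le \frac{1}{2^{1+1/p}}\left( |\xi - 1| + \left| 1 - \frac1\zeta \right| + |\zeta - 1| + \left| 1 - \frac1\xi \right| \right) \\
&\le \frac12\Big( \sqrt[q]{\|I - \Sigma_{d1}\|_2^q + \|I - \Sigma_{d2}^{-1}\|_2^q} + \sqrt[q]{\|I - \Sigma_{d1}^{-1}\|_2^q + \|I - \Sigma_{d2}\|_2^q} \Big),
\end{align*}

which gives (6.18).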
7 A Theorem of Ostrowski and Other Theorems

In this section, we briefly review the current state of research on the problems listed in §1.1, together with our remarks.

Let $A$ be an $n \times n$ Hermitian matrix. Perturbing $A$ to $D^*AD$, where $D$ is nonsingular, is actually performing a congruence transformation on $A$ by $D$. The following theorem is due to Ostrowski [17, pp. 224-225].

Theorem 7.1 (Ostrowski) Let $A, D \in \mathbb{C}^{n\times n}$ with $A$ Hermitian and $D$ nonsingular. Define $\widetilde A = D^*AD$. Denote the eigenvalues of $A$ and $\widetilde A$ as in (3.1) and arrange them in the order specified by (3.4). Then there exist $\theta_j$'s so that

\[ \sigma_{\min}(D)^2 \le \theta_j \le \sigma_{\max}(D)^2 \quad\text{and}\quad \tilde\lambda_j = \theta_j\lambda_j, \]

for $j = 1, 2, \dots, n$.

Ostrowski's theorem immediately implies a relative perturbation bound on Hermitian eigenvalues.

Theorem 7.2 Let the conditions of Theorem 7.1 hold. Then

\[ \frac{|\tilde\lambda_j - \lambda_j|}{|\lambda_j|} \le \|I - D^*D\|_2, \]

or in other words,

\[ \tilde\lambda_j = \lambda_j(1 + \delta_j) \quad\text{with}\quad |\delta_j| \le \|I - D^*D\|_2, \]

for $j = 1, 2, \dots, n$.
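Theorems 7.1 and 7.2 apply to indefinite Hermitian matrices as well, which the sketch below illustrates. The matrix $A$ is built from a prescribed indefinite spectrum (hypothetical test data, kept away from zero so the ratios $\theta_j = \tilde\lambda_j / \lambda_j$ are well defined), and $D$ is a random matrix close to the identity.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5

# Hermitian A with a prescribed indefinite spectrum.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag([3.0, 1.0, 0.5, -1.0, -2.0]) @ Q.T
A = (A + A.T) / 2                                    # remove rounding asymmetry

D = np.eye(n) + 0.05 * rng.standard_normal((n, n))
A_tilde = D.T @ A @ D                                # congruence D^* A D for real D

lam = np.linalg.eigvalsh(A)[::-1]                    # descending order, as in (3.4)
lam_tilde = np.linalg.eigvalsh(A_tilde)[::-1]

theta = lam_tilde / lam                              # Ostrowski multipliers theta_j
delta = theta - 1.0                                  # lam_tilde_j = lam_j (1 + delta_j)

sv = np.linalg.svd(D, compute_uv=False)              # descending singular values of D
bound = np.linalg.norm(np.eye(n) - D.T @ D, 2)       # ||I - D^* D||_2

print("theta_j:", theta)
print(f"allowed range: [{sv[-1]**2:.4f}, {sv[0]**2:.4f}]")
print(f"max |delta_j| = {np.abs(delta).max():.3e} <= {bound:.3e}")
```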

Although the inequality (5.1) of Theorem 5.1 and Theorem 7.2 are independent in the sense that neither can be inferred from the other, the latter is practically more useful in the following respects:

1. Theorem 7.2 covers more, while the inequality (5.1) of Theorem 5.1 covers nonnegative definite matrices only;

2. Theorem 7.2 is more friendly in the sense that it bounds $\delta_j$ directly in the expression $\tilde\lambda_j = \lambda_j(1 + \delta_j)$, which makes it easy to bound variations of $\mathrm{RelDist}_p$ as shown in Proposition 2.3 and Part II of this series [22].

Ostrowski's theorem also applies to singular value problems of matrices $B$ and $\widetilde B = D_1^*BD_2$ by working with Hermitian matrices


(7.1) 
Corollary 7.1 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m \times n$ matrices, where $D_1$ and $D_2$ are nonsingular. Denote their singular values as in (3.2) and arrange them in descending order respectively as in (3.3). Then

\[ \min\{\sigma_{\min}(D_1)^2, \sigma_{\min}(D_2)^2\} \le \frac{\tilde\sigma_j}{\sigma_j} \le \max\{\sigma_{\max}(D_1)^2, \sigma_{\max}(D_2)^2\}, \]

which gives

\[ \frac{|\tilde\sigma_j - \sigma_j|}{\sigma_j} \le \max\{\|I - D_1^*D_1\|_2, \|I - D_2^*D_2\|_2\}, \]

or in other words,

\[ \tilde\sigma_j = \sigma_j(1 + \gamma_j) \quad\text{with}\quad |\gamma_j| \le \max\{\|I - D_1^*D_1\|_2, \|I - D_2^*D_2\|_2\}, \]

for $j = 1, 2, \dots, n$.

This corollary, though it is an immediate consequence of the above Ostrowski theorem and the equation (7.1), has appeared nowhere. Corollary 7.1 also has an advantage over Theorem 6.8s and the inequality (5.4) of Theorem 5.2 in that it bounds $\gamma_j$ directly in the expression $\tilde\sigma_j = \sigma_j(1 + \gamma_j)$. Of course, one can develop bounds on $\gamma_j$ with little effort from Theorem 6.8s and Theorem 5.2. It turns out that Corollary 7.1 provides a less sharp bound than the following theorem due to Eisenstat and Ipsen [10].

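Theorem 7.3 (Eisenstat-Ipsen) Assume the conditions are as described in Corollary 7.1. Then

\[ \sigma_{\min}(D_1)\sigma_{\min}(D_2) \le \frac{\tilde\sigma_j}{\sigma_j} \le \sigma_{\max}(D_1)\sigma_{\max}(D_2), \]

which yields

\[ \frac{|\tilde\sigma_j - \sigma_j|}{\sigma_j} \le \max\{|1 - \sigma_{\min}(D_1)\sigma_{\min}(D_2)|,\ |1 - \sigma_{\max}(D_1)\sigma_{\max}(D_2)|\}, \]

or in other words, $\tilde\sigma_j = \sigma_j(1 + \gamma_j)$ with

\[ |\gamma_j| \le \max\{|1 - \sigma_{\min}(D_1)\sigma_{\min}(D_2)|,\ |1 - \sigma_{\max}(D_1)\sigma_{\max}(D_2)|\}, \]

for $j = 1, 2, \dots, n$.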

Theorem 7.3 always provides a sharper bound than Corollary 7.1 does, as the following lemma indicates.

Lemma 7.1 For $\xi, \zeta > 0$,

\[ \max\{|1 - \xi^2|, |1 - \zeta^2|\} \ge |1 - \xi\zeta|, \tag{7.2} \]

and the equality sign holds if and only if $\xi = \zeta$.
Proof: The inequality is obvious if either $\max\{\xi, \zeta\} \le 1$ or $\min\{\xi, \zeta\} \ge 1$. It is also clear if either $\xi = 1$ or $\zeta = 1$. Now it suffices for us to consider the case when $0 < \xi < 1 < \zeta$.

1. $1 - \xi^2 \ge \zeta^2 - 1 \Rightarrow \xi^2 + \zeta^2 \le 2 \Rightarrow \xi\zeta \le 1 \Rightarrow 1 - \xi^2 \ge 1 - \xi\zeta = |1 - \xi\zeta|$;

2. $1 - \xi^2 < \zeta^2 - 1 \Rightarrow \xi^2 + \zeta^2 > 2 \Rightarrow \xi\zeta + \zeta^2 > \xi^2 + \zeta^2 > 2 \Rightarrow \zeta^2 - 1 > 1 - \xi\zeta$; also $\zeta^2 > \xi\zeta \Rightarrow \zeta^2 - 1 > \xi\zeta - 1$. So $\zeta^2 - 1 > |1 - \xi\zeta|$.

From the above proof, it is clear that $\max\{|1 - \xi^2|, |1 - \zeta^2|\} = |1 - \xi\zeta|$ if and only if $\xi = \zeta$.
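Lemma 7.1 is a purely scalar statement, so it can be spot-checked on a grid. In the sketch below, $\xi$ and $\zeta$ play the roles of extreme singular values of $D_1$ and $D_2$; the two function names are introduced here for illustration and simply evaluate the two sides of (7.2).

```python
def corollary_71_factor(xi, zeta):
    # Left-hand side of (7.2): the quantity governing Corollary 7.1.
    return max(abs(1 - xi ** 2), abs(1 - zeta ** 2))

def theorem_73_factor(xi, zeta):
    # Right-hand side of (7.2): the quantity governing Theorem 7.3.
    return abs(1 - xi * zeta)

grid = [0.1 * k for k in range(1, 31)]       # xi, zeta in (0, 3]
violations = [(xi, zeta) for xi in grid for zeta in grid
              if theorem_73_factor(xi, zeta) > corollary_71_factor(xi, zeta) + 1e-12]

print("violations of (7.2) on the grid:", violations)
print("example xi=0.8, zeta=1.3:",
      theorem_73_factor(0.8, 1.3), "<=", corollary_71_factor(0.8, 1.3))
```

For $\xi = 0.8$ and $\zeta = 1.3$ the product bound is $0.04$ while the squared bound is $0.69$, which shows how much sharper Theorem 7.3 can be when the two scalings partially cancel.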

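Regarding graded matrices, the following two theorems are due to Demmel & Veselic [9] and Mathias [25].

Theorem 7.4 (Demmel-Veselic) Let the conditions of Theorem 5.4 hold. Arrange the eigenvalues of $H = D^*AD$ and $\widetilde H = D^*\widetilde AD$ descendingly as in (3.4). Then

\[ \frac{|\tilde\lambda_j - \lambda_j|}{\lambda_j} \le \|A^{-1}\|_2\|\Delta A\|_2, \]

or in other words,

\[ \tilde\lambda_j = \lambda_j(1 + \delta_j) \quad\text{with}\quad |\delta_j| \le \|A^{-1}\|_2\|\Delta A\|_2, \]

for $j = 1, 2, \dots, n$.

Theorem 7.5 (Mathias) Let the conditions of Theorem 5.3 hold. Arrange the singular values of $G = BD$ and $\widetilde G = \widetilde BD$ descendingly as in (3.3). Then

\[ \frac{|\tilde\sigma_j - \sigma_j|}{\sigma_j} \le \|B^{-1}\|_2\|\Delta B\|_2, \]

or in other words,

\[ \tilde\sigma_j = \sigma_j(1 + \gamma_j) \quad\text{with}\quad |\gamma_j| \le \|B^{-1}\|_2\|\Delta B\|_2, \]

for $j = 1, 2, \dots, n$.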

Finally,  let  us  see  what  we  can  get  from  Theorems  7.2,  7.4,  7.5  and  7.3  and 
Corollary  7.1,  in  terms  of  the  two  kinds  of  relative  distances  defined  in  §2. 

1.  From  Theorem  7.2,  it  follows 

RelDistp(Aj ,  Aj)  <  RelDistoo(Aj ,  Aj)  <  \\L  —  D*D\\2,  (7-3) 

RdDist^A,)  <  117 (7-4) 

Vmm(D) 
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The  inequality  (7.3)  holds  because 


RelDistoo (A j ,  Xj )  =  ^  <  \\I  -  D* D\\2; 

max{|Aj  |,  |Aj  |)  \Xj\ 


and  the  inequality  (7.4)  holds  because 


RelDist(Aj ,  Xj) 


|Ay  ~  Ay| 

\f  I  A?  I  I  A?  I 


|Ay  ~  Ay|  [\X~\  ^\\I  -  D*  P\\2 

I  A?  I  y  | Ay  |  ^min(D) 


2.  From  Corollary  7.1,  we  have 


RelDistoo (<7j ,  &j) 
RelDist((Tj ,  <7 j) 


<  max{||/-DtD1||2,  ||/-Dp2||2}, 

<  max{ 1 1 J  —  D* Di 1 1 2 ,  ||/ -  D^D2||2} 

—  min{(7mm(Di)  5  ^min  m) 


(7.5) 

(7.6) 


3.  From  Theorem  7.3,  it  follows 


RelDistoo  (<7j  ,  &j ) 

<  max{ 1 1  -  <7mm(5i)<7mm(5>2)|,  |1  -  crmax (i3>i )crmax (i3>2 ) | } ,  (7.7) 
RelDist(<7j ,  cij) 

^  max{  1 1  CTmin  (7^1  )<Tmin  (52)  | ,  |1  crmax(-5i  )(7max(52)  | }  g^ 

(-Dl)<7min(D2) 

The  inequalities  (7.7)  and  (7.8)  are  sharper  than  (7.5)  and  (7.6),  respec¬ 
tively. 

4.  From  Theorem  7.4,  we  have 


RelDistoo  (Aj  ,  A  j) 


RelDist(Aj ,  A  j) 


<  II^IMIATH,, 

^  11^4.— 1 1|2||  AxF||2 

-  \/l  —  ||T-1||2||AT||2 


(7.9) 

(7.10) 


The  inequality  (7.10)  has  been  derived  in  Theorem  5.4. 
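5. From Theorem 7.5, it follows

\[ \mathrm{RelDist}_\infty(\sigma_j, \tilde\sigma_j) \le \|B^{-1}\|_2 \|\Delta B\|_2, \tag{7.11} \]

\[ \mathrm{RelDist}(\sigma_j, \tilde\sigma_j) \le \frac{\|B^{-1}\|_2 \|\Delta B\|_2}{\sqrt{1 - \|B^{-1}\|_2 \|\Delta B\|_2}}. \tag{7.12} \]

The inequality (7.12) turns out to be sharper than the last "≤" in (5.9)
of Theorem 5.3.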
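
The bounds (7.3) and (7.4) can be illustrated numerically. The following is a minimal sketch, assuming a random real symmetric test matrix and a random $D$ close to $I$ (both hypothetical choices, not taken from the paper); the helper names `rel_dist_inf` and `rel_dist` are introduced here only for illustration.

```python
import numpy as np

def rel_dist_inf(a, b):
    """RelDist_inf(a, b) = |a - b| / max(|a|, |b|), with 0/0 read as 0."""
    m = max(abs(a), abs(b))
    return 0.0 if m == 0 else abs(a - b) / m

def rel_dist(a, b):
    """RelDist(a, b) = |a - b| / sqrt(|a b|) for nonzero a, b."""
    return abs(a - b) / np.sqrt(abs(a * b))

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                                    # real symmetric test matrix
D = np.eye(n) + 0.05 * rng.standard_normal((n, n))   # nonsingular, close to I
A_tilde = D.T @ A @ D

lam = np.sort(np.linalg.eigvalsh(A))[::-1]            # descending order
lam_tilde = np.sort(np.linalg.eigvalsh(A_tilde))[::-1]

eps = np.linalg.norm(np.eye(n) - D.T @ D, 2)          # ||I - D^* D||_2
smin = np.linalg.svd(D, compute_uv=False).min()       # sigma_min(D)

worst_inf = max(rel_dist_inf(a, b) for a, b in zip(lam, lam_tilde))
worst = max(rel_dist(a, b) for a, b in zip(lam, lam_tilde))
print(worst_inf, "<=", eps)           # illustrates (7.3)
print(worst, "<=", eps / smin)        # illustrates (7.4)
```

A single random instance does not prove anything, of course; it only shows how the quantities in (7.3) and (7.4) are formed and compared.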
8  Remarks  on  Generalized  Eigenvalue  Problems
and  Generalized  Singular  Value  Problems

In this section, we comment briefly on the following perturbations.
As we shall see, the results in previous sections, as well as those in Li [22], can
be applied to derive relative perturbation bounds for them.

• Generalized eigenvalue problem:

$H_1 - \lambda H_2 = D_1^*A_1D_1 - \lambda D_2^*A_2D_2$ and $\tilde H_1 - \lambda \tilde H_2 = D_1^*\tilde A_1D_1 - \lambda D_2^*\tilde A_2D_2$
with all $A_i$ and $\tilde A_i$ positive definite and $\|A_i^{-1}\|_2 \|\tilde A_i - A_i\|_2 < 1$, where $D_i$
are some square matrices and one of them is nonsingular.

• Generalized singular value problem:

$\{G_1, G_2\} = \{B_1D_1, B_2D_2\}$ and $\{\tilde G_1, \tilde G_2\} = \{\tilde B_1D_1, \tilde B_2D_2\}$ with all $B_i$
and $\tilde B_i$ nonsingular and $\|B_i^{-1}\|_2 \|\tilde B_i - B_i\|_2 < 1$, where $D_i$ are some square
matrices and one of them is nonsingular.

For the above-mentioned generalized eigenvalue problem, without loss of generality,
consider only the case when $D_2$ is nonsingular. Then the generalized
eigenvalue problem for $H_1 - \lambda H_2 = D_1^*A_1D_1 - \lambda D_2^*A_2D_2$ is equivalent to the
standard eigenvalue problem for

\[ A_2^{-1/2} D_2^{-*} D_1^* A_1 D_1 D_2^{-1} A_2^{-1/2}; \tag{8.1} \]

and the generalized eigenvalue problem for $\tilde H_1 - \lambda \tilde H_2 = D_1^*\tilde A_1D_1 - \lambda D_2^*\tilde A_2D_2$
is equivalent to the standard eigenvalue problem for

\[ \widehat D^* A_2^{-1/2} D_2^{-*} D_1^* \tilde A_1 D_1 D_2^{-1} A_2^{-1/2} \widehat D, \tag{8.2} \]

where $\Delta A_2 := \tilde A_2 - A_2$ and $\widehat D = \widehat D^* := (I + A_2^{-1/2}(\Delta A_2)A_2^{-1/2})^{-1/2}$. So
bounding relative distances between the eigenvalues of $H_1 - \lambda H_2$ and those of
$\tilde H_1 - \lambda \tilde H_2$ is transformed to bounding relative distances between the eigenvalues
of the matrix (8.1) and those of the matrix (8.2). The latter can be accomplished
in two steps:

1. Bounding relative distances between the eigenvalues of the matrix (8.1)
and those of

\[ \widehat D^* A_2^{-1/2} D_2^{-*} D_1^* A_1 D_1 D_2^{-1} A_2^{-1/2} \widehat D; \tag{8.3} \]

2. Bounding relative distances between the eigenvalues of the matrix (8.3)
and those of the matrix (8.2).
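
The reduction of the pencil $H_1 - \lambda H_2$ to the standard eigenvalue problem (8.1) can be checked on a small example. This is a sketch under stated assumptions: the matrices are random, $D_2$ is taken to be a mildly scaled diagonal matrix, and the helper names `inv_sqrt_spd` and `random_spd` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

def inv_sqrt_spd(S):
    """Inverse square root of a symmetric positive definite matrix via eigendecomposition."""
    w, Q = np.linalg.eigh(S)
    return (Q / np.sqrt(w)) @ Q.T

def random_spd(n):
    """A random symmetric positive definite matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

A1, A2 = random_spd(n), random_spd(n)
D1 = rng.standard_normal((n, n))                 # any square matrix
D2 = np.diag(rng.uniform(0.5, 2.0, n))           # nonsingular diagonal scaling

H1 = D1.T @ A1 @ D1
H2 = D2.T @ A2 @ D2

# Generalized eigenvalues of H1 - lambda*H2, computed as eigenvalues of H2^{-1} H1.
gen = np.sort(np.linalg.eigvals(np.linalg.solve(H2, H1)).real)

# Eigenvalues of the matrix (8.1): A2^{-1/2} D2^{-*} D1^* A1 D1 D2^{-1} A2^{-1/2}.
T = D1 @ np.linalg.inv(D2) @ inv_sqrt_spd(A2)    # D1 D2^{-1} A2^{-1/2}
M81 = T.T @ A1 @ T
std = np.sort(np.linalg.eigvalsh(M81))

print(np.max(np.abs(gen - std) / np.abs(std)))   # relative mismatch, at rounding level
```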

As to the above-mentioned generalized singular value problem, we shall consider
the corresponding generalized eigenvalue problems [20, 32, 34] for

\[ D_1^*B_1^*B_1D_1 - \lambda D_2^*B_2^*B_2D_2 \quad\text{and}\quad D_1^*\tilde B_1^*\tilde B_1D_1 - \lambda D_2^*\tilde B_2^*\tilde B_2D_2, \]

instead.




9  Proofs  of  Theorems  6.1  and  6.4 


To prove the theorems, we need a little preparation. A matrix $Y = (y_{ij}) \in \mathbb{R}^{n\times n}$
is doubly stochastic if all $y_{ij} \ge 0$ and

\[ \sum_{k=1}^n y_{ik} = \sum_{k=1}^n y_{kj} = 1 \quad\text{for } i, j = 1, 2, \ldots, n. \]

A matrix $P \in \mathbb{R}^{n\times n}$ is called a permutation matrix if exactly one entry in each
row and each column equals 1 and all others are zero. Let $e_i$ be the $i$th column
vector of $I_n$. Each permutation matrix $P$ corresponds to a unique permutation
$\tau$ of $\{1, 2, \ldots, n\}$ so that

\[ P = (e_{\tau(1)}, e_{\tau(2)}, \ldots, e_{\tau(n)}), \]

and vice versa. The following wonderful result is due to Birkhoff [5] (see also
[17, pp. 527-528]).

Lemma 9.1 (Birkhoff) An $n \times n$ matrix is doubly stochastic if and only if it
lies in the convex hull of the $n!$ permutation matrices.

Lemma 9.2 Let $Y = (y_{ij})$ be an $n \times n$ doubly stochastic matrix, and let $M =
(m_{ij}) \in \mathbb{C}^{n\times n}$. Then there exists a permutation $\tau$ of $\{1, 2, \ldots, n\}$ such that

\[ \sum_{i,j=1}^n |m_{ij}|^2 y_{ij} \ge \sum_{i=1}^n |m_{i\tau(i)}|^2. \]

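Proof: Denote all $n \times n$ permutation matrices by $P_k$, and their corresponding
permutations of $\{1, 2, \ldots, n\}$ by $\tau_k$, where $k = 1, 2, \ldots, n!$. It follows from
Lemma 9.1 that $Y$ can be written as

\[ Y = \sum_{k=1}^{n!} \alpha_k P_k, \]

where $\alpha_k \ge 0$ and $\sum_{k=1}^{n!} \alpha_k = 1$. Hence

\[ \sum_{i,j=1}^n |m_{ij}|^2 y_{ij} = \sum_{k=1}^{n!} \alpha_k \sum_{i=1}^n |m_{i\tau_k(i)}|^2 \ge \min_{1\le k\le n!} \sum_{i=1}^n |m_{i\tau_k(i)}|^2, \]

as was to be shown. ∎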
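
For small $n$, Lemma 9.2 can be checked by brute force over all $n!$ permutations. The sketch below assumes a doubly stochastic $Y$ built as a random convex combination of permutation matrices (as Lemma 9.1 permits) and a random complex $M$; both are illustrative choices.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 4
perms = list(itertools.permutations(range(n)))

# A doubly stochastic Y as a random convex combination of permutation matrices.
alpha = rng.random(len(perms))
alpha /= alpha.sum()
Y = np.zeros((n, n))
for a, tau in zip(alpha, perms):
    P = np.zeros((n, n))
    P[np.arange(n), list(tau)] = 1.0      # entry (i, tau(i)) equals 1
    Y += a * P

M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W = np.abs(M) ** 2

weighted = np.sum(W * Y)                  # sum_{i,j} |m_ij|^2 y_ij
best = min(sum(W[i, tau[i]] for i in range(n)) for tau in perms)
print(best, "<=", weighted)               # some permutation attains at most the weighted sum
```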

The  trick  in  the  above  proof  is  quite  standard.  It  was  first  used  by  Hoffman 
and  Wielandt  [16],  and  Sun  [31]  used  it  to  prove  a  Hoffman-Wielandt  type 
theorem  for  a  special  class  of  matrix  pencils. 

The following lemma is due to Elsner and Friedland [12].

Lemma 9.3 (Elsner-Friedland) Let $Y = (y_{ij}) \in \mathbb{C}^{n\times n}$. Then there exist two
$n \times n$ doubly stochastic matrices $Y_1$, $Y_2$, so that entrywise

\[ \sigma_{\min}(Y)^2\, Y_1 \le (|y_{ij}|^2) \le \sigma_{\max}(Y)^2\, Y_2, \]

where $\sigma_{\min}(Y)$ and $\sigma_{\max}(Y)$ are the smallest and largest singular values of $Y$,
respectively.

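Proof of Theorem 6.1: Let us first derive our perturbation equations.

\[ X^{-1}(A - \tilde A)\tilde X = \Lambda X^{-1}\tilde X - X^{-1}\tilde X\tilde\Lambda, \]
\[ A - \tilde A = A - D_1^*AD_2 = A - AD_2 + AD_2 - D_1^*AD_2 = A(I - D_2) + (D_1^{-*} - I)\tilde A, \]
\[ \tilde X^{-1}(A - \tilde A)X = \tilde X^{-1}X\Lambda - \tilde\Lambda\tilde X^{-1}X, \]
\[ A - \tilde A = A - D_1^*AD_2 = A - D_1^*A + D_1^*A - D_1^*AD_2 = (I - D_1^*)A + \tilde A(D_2^{-1} - I). \]

Thus, we have

\[ \Lambda X^{-1}\tilde X - X^{-1}\tilde X\tilde\Lambda = \Lambda X^{-1}(I - D_2)\tilde X + X^{-1}(D_1^{-*} - I)\tilde X\tilde\Lambda, \tag{9.1} \]
\[ \tilde X^{-1}X\Lambda - \tilde\Lambda\tilde X^{-1}X = \tilde X^{-1}(I - D_1^*)X\Lambda + \tilde\Lambda\tilde X^{-1}(D_2^{-1} - I)X. \tag{9.2} \]

Set $Y := X^{-1}\tilde X = (y_{ij})$, $E := X^{-1}(I - D_2)\tilde X = (e_{ij})$ and $\tilde E := X^{-1}(D_1^{-*} - I)\tilde X = (\tilde e_{ij})$.
Then the equation (9.1) reads $\Lambda Y - Y\tilde\Lambda = \Lambda E + \tilde E\tilde\Lambda$, or componentwise
$\lambda_i y_{ij} - y_{ij}\tilde\lambda_j = \lambda_i e_{ij} + \tilde e_{ij}\tilde\lambda_j$, so

\[ |(\lambda_i - \tilde\lambda_j) y_{ij}|^2 \le (|\lambda_i|^2 + |\tilde\lambda_j|^2)(|e_{ij}|^2 + |\tilde e_{ij}|^2), \]

which yields

\[ |e_{ij}|^2 + |\tilde e_{ij}|^2 \ge [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_j)]^2\, |y_{ij}|^2. \]

Hence

\[ \|X^{-1}(I - D_2)\tilde X\|_F^2 + \|X^{-1}(D_1^{-*} - I)\tilde X\|_F^2 \ge \sum_{i,j=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_j)]^2\, |y_{ij}|^2, \tag{9.3} \]

which, together with Lemmas 9.3 and 9.2, shows that

\[ \|X^{-1}(I - D_2)\tilde X\|_F^2 + \|X^{-1}(D_1^{-*} - I)\tilde X\|_F^2 \ge \sigma_{\min}(Y)^2 \sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2 \]

for some permutation $\tau$ of $\{1, 2, \ldots, n\}$. Since

\[ \sigma_{\min}(Y) = \|Y^{-1}\|_2^{-1} = \|\tilde X^{-1}X\|_2^{-1} \ge \|\tilde X^{-1}\|_2^{-1}\|X\|_2^{-1}, \]

we have

\[ \begin{aligned}
\|\tilde X^{-1}\|_2\|X\|_2 \sqrt{\|X^{-1}(I - D_2)\tilde X\|_F^2 + \|X^{-1}(D_1^{-*} - I)\tilde X\|_F^2}
&\ge \|\tilde X^{-1}\|_2\|X\|_2\, \sigma_{\min}(Y) \sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2} \\
&\ge \sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2}.
\end{aligned} \tag{9.4} \]

Set $\tilde Y := \tilde X^{-1}X = (\tilde y_{ij})$. Similarly, we get

\[ \|\tilde X^{-1}(I - D_1^*)X\|_F^2 + \|\tilde X^{-1}(D_2^{-1} - I)X\|_F^2 \ge \sum_{i,j=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_j)]^2\, |\tilde y_{ji}|^2, \]

which, together with Lemmas 9.3 and 9.2, shows that

\[ \|\tilde X^{-1}(I - D_1^*)X\|_F^2 + \|\tilde X^{-1}(D_2^{-1} - I)X\|_F^2 \ge \sigma_{\min}(\tilde Y)^2 \sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2, \]

where

\[ \sigma_{\min}(\tilde Y) = \|\tilde Y^{-1}\|_2^{-1} = \|X^{-1}\tilde X\|_2^{-1} \ge \|X^{-1}\|_2^{-1}\|\tilde X\|_2^{-1}. \]

Proceeding along the same lines as in (9.4), we reach

\[ \|X^{-1}\|_2\|\tilde X\|_2 \sqrt{\|\tilde X^{-1}(I - D_1^*)X\|_F^2 + \|\tilde X^{-1}(D_2^{-1} - I)X\|_F^2} \ge \sqrt{\sum_{i=1}^n [\mathrm{RelDist}_2(\lambda_i, \tilde\lambda_{\tau(i)})]^2}. \tag{9.5} \]

The inequality (6.1) is now a simple consequence of (9.4) and (9.5). ∎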
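
The perturbation equation (9.1) is an exact identity, so it can be confirmed in floating point up to rounding. The following sketch assumes random real test matrices and uses `numpy.linalg.eig` for the eigendecompositions $A = X\Lambda X^{-1}$ and $\tilde A = \tilde X\tilde\Lambda\tilde X^{-1}$; the variable names are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n))
D1 = np.eye(n) + 0.1 * rng.standard_normal((n, n))
D2 = np.eye(n) + 0.1 * rng.standard_normal((n, n))
A_t = D1.conj().T @ A @ D2                 # A~ = D1^* A D2

lam, X = np.linalg.eig(A)                  # A  = X  Lam  X^{-1}
lam_t, X_t = np.linalg.eig(A_t)            # A~ = X~ Lam~ X~^{-1}
Lam, Lam_t = np.diag(lam), np.diag(lam_t)
Xinv = np.linalg.inv(X)
I = np.eye(n)

Y = Xinv @ X_t                                        # Y  = X^{-1} X~
E = Xinv @ (I - D2) @ X_t                             # E  = X^{-1}(I - D2) X~
E_t = Xinv @ (np.linalg.inv(D1).conj().T - I) @ X_t   # E~ = X^{-1}(D1^{-*} - I) X~

lhs = Lam @ Y - Y @ Lam_t
rhs = Lam @ E + E_t @ Lam_t
print(np.max(np.abs(lhs - rhs)))           # residual of (9.1), at rounding level
```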

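A proof of Theorem 6.4 is based on the following result due to Li [21, pp. 207-208].
For $X \in \mathbb{C}^{m\times n}$, introduce the following notation for a $k \times \ell$ submatrix
of $X = (x_{ij})$:

\[ X\begin{pmatrix} i_1 & \cdots & i_k \\ j_1 & \cdots & j_\ell \end{pmatrix} := \begin{pmatrix} x_{i_1j_1} & x_{i_1j_2} & \cdots & x_{i_1j_\ell} \\ x_{i_2j_1} & x_{i_2j_2} & \cdots & x_{i_2j_\ell} \\ \vdots & \vdots & & \vdots \\ x_{i_kj_1} & x_{i_kj_2} & \cdots & x_{i_kj_\ell} \end{pmatrix}, \]

where $1 \le i_1 < \cdots < i_k \le m$ and $1 \le j_1 < \cdots < j_\ell \le n$.

Lemma 9.4 (Li) Suppose that $X \in \mathbb{C}^{n\times n}$ is nonsingular, $1 \le i_1 < \cdots < i_k \le n$
and $1 \le j_1 < \cdots < j_\ell \le n$, and $k + \ell > n$. Then

\[ \left\| X\begin{pmatrix} i_1 & \cdots & i_k \\ j_1 & \cdots & j_\ell \end{pmatrix} \right\|_p \ge \|X^{-1}\|_p^{-1}. \]

Moreover, if $X$ is unitary then

\[ \left\| X\begin{pmatrix} i_1 & \cdots & i_k \\ j_1 & \cdots & j_\ell \end{pmatrix} \right\|_2 = 1. \]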

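Proof of Theorem 6.4: Let $k$ be the index such that

\[ \eta_p := \max_{1\le i\le n} \mathrm{RelDist}_p(\lambda_i, \tilde\lambda_i) = \mathrm{RelDist}_p(\lambda_k, \tilde\lambda_k). \]

If $\eta_p = 0$, the inequality (6.7) is trivial. Assume $\eta_p > 0$. Also assume, without
loss of generality, that

\[ \lambda_k > \tilde\lambda_k \ge 0. \]

Partition $X$, $X^{-1}$, $\tilde X$ and $\tilde X^{-1}$ as follows:

\[ X = (X_1, X_2), \quad X^{-1} = \begin{pmatrix} W_1^* \\ W_2^* \end{pmatrix}, \quad \tilde X = (\tilde X_1, \tilde X_2), \quad \tilde X^{-1} = \begin{pmatrix} \tilde W_1^* \\ \tilde W_2^* \end{pmatrix}, \]

where $X_1, W_1 \in \mathbb{C}^{n\times k}$ and $\tilde X_1, \tilde W_1 \in \mathbb{C}^{n\times(k-1)}$, and write $\Lambda = \mathrm{diag}(\Lambda_1, \Lambda_2)$ and
$\tilde\Lambda = \mathrm{diag}(\tilde\Lambda_1, \tilde\Lambda_2)$, where $\Lambda_1 \in \mathbb{R}^{k\times k}$ and $\tilde\Lambda_1 \in \mathbb{R}^{(k-1)\times(k-1)}$. It follows from the
equations (9.1) and (9.2) that

\[ \Lambda_1 W_1^*\tilde X_2 - W_1^*\tilde X_2\tilde\Lambda_2 = \Lambda_1 W_1^*(I - D_2)\tilde X_2 + W_1^*(D_1^{-*} - I)\tilde X_2\tilde\Lambda_2, \tag{9.7} \]
\[ \tilde W_2^*X_1\Lambda_1 - \tilde\Lambda_2\tilde W_2^*X_1 = \tilde W_2^*(I - D_1^*)X_1\Lambda_1 + \tilde\Lambda_2\tilde W_2^*(D_2^{-1} - I)X_1, \tag{9.8} \]

which gives

\[ W_1^*\tilde X_2 - \Lambda_1^{-1}W_1^*\tilde X_2\tilde\Lambda_2 = W_1^*(I - D_2)\tilde X_2 + \Lambda_1^{-1}W_1^*(D_1^{-*} - I)\tilde X_2\tilde\Lambda_2, \tag{9.9} \]
\[ \tilde W_2^*X_1 - \tilde\Lambda_2\tilde W_2^*X_1\Lambda_1^{-1} = \tilde W_2^*(I - D_1^*)X_1 + \tilde\Lambda_2\tilde W_2^*(D_2^{-1} - I)X_1\Lambda_1^{-1}. \tag{9.10} \]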

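Lemma 9.4 implies

\[ \|W_1^*\tilde X_2\|_p \ge \|(X^{-1}\tilde X)^{-1}\|_p^{-1} = \|\tilde X^{-1}X\|_p^{-1} \ge \|\tilde X^{-1}\|_p^{-1}\|X\|_p^{-1}, \]
\[ \|\tilde W_2^*X_1\|_p \ge \|(\tilde X^{-1}X)^{-1}\|_p^{-1} = \|X^{-1}\tilde X\|_p^{-1} \ge \|X^{-1}\|_p^{-1}\|\tilde X\|_p^{-1}, \]

since $W_1^*\tilde X_2$ is a $k \times (n - k + 1)$ submatrix of $X^{-1}\tilde X$, and $\tilde W_2^*X_1$ is an $(n - k + 1) \times k$
submatrix of $\tilde X^{-1}X$, and $k + (n - k + 1) = n + 1 > n$. So it follows from (9.9)
that

\[ \begin{aligned}
\Big(1 - \frac{\tilde\lambda_k}{\lambda_k}\Big)\|\tilde X^{-1}\|_p^{-1}\|X\|_p^{-1}
&\le \Big(1 - \frac{\tilde\lambda_k}{\lambda_k}\Big)\|W_1^*\tilde X_2\|_p \\
&= \|W_1^*\tilde X_2\|_p - \|\Lambda_1^{-1}\|_p\, \|W_1^*\tilde X_2\|_p\, \|\tilde\Lambda_2\|_p \\
&\le \|W_1^*\tilde X_2\|_p - \|\Lambda_1^{-1}W_1^*\tilde X_2\tilde\Lambda_2\|_p \\
&\le \|W_1^*\tilde X_2 - \Lambda_1^{-1}W_1^*\tilde X_2\tilde\Lambda_2\|_p \\
&= \|W_1^*(I - D_2)\tilde X_2 + \Lambda_1^{-1}W_1^*(D_1^{-*} - I)\tilde X_2\tilde\Lambda_2\|_p \\
&\le \|W_1^*(I - D_2)\tilde X_2\|_p + \frac{\tilde\lambda_k}{\lambda_k}\|W_1^*(D_1^{-*} - I)\tilde X_2\|_p \\
&\le \|X^{-1}\|_p\|\tilde X\|_p \Big(\|I - D_2\|_p + \frac{\tilde\lambda_k}{\lambda_k}\|D_1^{-*} - I\|_p\Big) \\
&\le \|X^{-1}\|_p\|\tilde X\|_p \sqrt[p]{1 + (\tilde\lambda_k/\lambda_k)^p}\; \sqrt[q]{\|I - D_2\|_p^q + \|I - D_1^{-*}\|_p^q},
\end{aligned} \]

where the last step is Hölder's inequality with $1/p + 1/q = 1$. Similarly, it follows from (9.10) that

\[ \Big(1 - \frac{\tilde\lambda_k}{\lambda_k}\Big)\|X^{-1}\|_p^{-1}\|\tilde X\|_p^{-1} \le \|\tilde X^{-1}\|_p\|X\|_p \sqrt[p]{1 + (\tilde\lambda_k/\lambda_k)^p}\; \sqrt[q]{\|I - D_1^*\|_p^q + \|I - D_2^{-1}\|_p^q}. \]

The inequality (6.7) is now a simple consequence of the above inequalities. ∎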
10  Proofs  of  Theorems  6.7  and  6.8


Proof of Theorem 6.7: We assume, without loss of generality, that $m = n$;
otherwise, we can augment $B$ and $\tilde B$ with zero blocks of suitable size. For
example, if $m > n$, we take

\[ B_1 = (B, 0_{m,m-n}), \qquad \tilde B_1 = D_1^* B_1\, \mathrm{diag}(D_2, I_{m-n}). \]

Since this only increases the number of zero singular values, and Proposition 2.7
says that zero singular values should always be paired with zero ones, we
still have (6.13) in the end once we prove it for $B_1$ and $\tilde B_1$.

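Assume now $m = n$ and let the singular value decompositions of $B$ and $\tilde B$
be

\[ B = U\Sigma V^* \quad\text{and}\quad \tilde B = \tilde U\tilde\Sigma\tilde V^*, \tag{10.1} \]

where $U, V, \tilde U, \tilde V \in \mathbb{U}_n$ and

\[ \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n) \quad\text{and}\quad \tilde\Sigma = \mathrm{diag}(\tilde\sigma_1, \ldots, \tilde\sigma_n). \tag{10.2} \]

Notice

\[ U^*(B - \tilde B)\tilde V = \Sigma V^*\tilde V - U^*\tilde U\tilde\Sigma, \]
\[ B - \tilde B = B - D_1^*BD_2 = B - BD_2 + BD_2 - D_1^*BD_2 = B(I - D_2) + (D_1^{-*} - I)\tilde B. \]

Thus, we have

\[ \Sigma V^*\tilde V - U^*\tilde U\tilde\Sigma = \Sigma V^*(I - D_2)\tilde V + U^*(D_1^{-*} - I)\tilde U\tilde\Sigma. \tag{10.3} \]

On the other hand, we have

\[ \tilde U^*(B - \tilde B)V = \tilde U^*U\Sigma - \tilde\Sigma\tilde V^*V, \]
\[ B - \tilde B = (I - D_1^*)B + \tilde B(D_2^{-1} - I). \]

Thus, we have

\[ \tilde U^*U\Sigma - \tilde\Sigma\tilde V^*V = \tilde U^*(I - D_1^*)U\Sigma + \tilde\Sigma\tilde V^*(D_2^{-1} - I)V. \]

Taking the conjugate transpose of both sides, we get

\[ \Sigma U^*\tilde U - V^*\tilde V\tilde\Sigma = \Sigma U^*(I - D_1)\tilde U + V^*(D_2^{-*} - I)\tilde V\tilde\Sigma. \tag{10.4} \]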
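Set $Q := U^*\tilde U = (q_{ij})$ and $\tilde Q := V^*\tilde V = (\tilde q_{ij})$. Both are unitary. Similarly to the
derivation of the inequality (9.3), from the perturbation equations (10.3) and
(10.4) one can get

\[ \|I - D_2\|_F^2 + \|I - D_1^{-*}\|_F^2 \ge \sum_{i,j=1}^n \frac{|\sigma_i\tilde q_{ij} - q_{ij}\tilde\sigma_j|^2}{\sigma_i^2 + \tilde\sigma_j^2}, \tag{10.5} \]
\[ \|I - D_1\|_F^2 + \|I - D_2^{-*}\|_F^2 \ge \sum_{i,j=1}^n \frac{|\sigma_i q_{ij} - \tilde q_{ij}\tilde\sigma_j|^2}{\sigma_i^2 + \tilde\sigma_j^2}. \tag{10.6} \]

We have

\[ \begin{aligned}
|\sigma_i\tilde q_{ij} - q_{ij}\tilde\sigma_j|^2 + |\sigma_i q_{ij} - \tilde q_{ij}\tilde\sigma_j|^2
&= \sigma_i^2|\tilde q_{ij}|^2 + |q_{ij}|^2\tilde\sigma_j^2 - 2\Re(\sigma_i\tilde q_{ij}\bar q_{ij}\tilde\sigma_j) \\
&\quad + \sigma_i^2|q_{ij}|^2 + |\tilde q_{ij}|^2\tilde\sigma_j^2 - 2\Re(\sigma_i q_{ij}\bar{\tilde q}_{ij}\tilde\sigma_j) \\
&\ge (\sigma_i - \tilde\sigma_j)^2(|q_{ij}|^2 + |\tilde q_{ij}|^2),
\end{aligned} \]

where $\Re(\cdot)$ takes the real part of a complex number. The last "≥" holds because

\[ 2\Re(\sigma_i\tilde q_{ij}\bar q_{ij}\tilde\sigma_j) \le \sigma_i\tilde\sigma_j(|q_{ij}|^2 + |\tilde q_{ij}|^2), \qquad 2\Re(\sigma_i q_{ij}\bar{\tilde q}_{ij}\tilde\sigma_j) \le \sigma_i\tilde\sigma_j(|q_{ij}|^2 + |\tilde q_{ij}|^2). \]

Now adding the corresponding two sides of the inequalities (10.5) and (10.6)
leads to

\[ \|I - D_2\|_F^2 + \|I - D_1^{-*}\|_F^2 + \|I - D_1\|_F^2 + \|I - D_2^{-*}\|_F^2 \ge 2\sum_{i,j=1}^n [\mathrm{RelDist}_2(\sigma_i, \tilde\sigma_j)]^2\, \frac{|q_{ij}|^2 + |\tilde q_{ij}|^2}{2}. \]

It is easy to see that the matrix whose $(i, j)$th entry is $(|q_{ij}|^2 + |\tilde q_{ij}|^2)/2$ is a doubly
stochastic matrix. Hence applying Lemma 9.2 leads to the inequality (6.13). ∎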
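
The doubly stochastic property used in the last step of the proof of Theorem 6.7 is easy to confirm numerically. This sketch assumes random unitary factors generated from QR factorizations (the helper `random_unitary` is hypothetical) and checks that the entrywise average of $|q_{ij}|^2$ and $|\tilde q_{ij}|^2$ has unit row and column sums.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5

def random_unitary(n):
    """Q factor of a random complex matrix; unitary up to rounding."""
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(Z)
    return Q

U, U_t, V, V_t = (random_unitary(n) for _ in range(4))
Q = U.conj().T @ U_t        # Q  = U^* U~
Q_t = V.conj().T @ V_t      # Q~ = V^* V~

S = (np.abs(Q) ** 2 + np.abs(Q_t) ** 2) / 2
print(S.sum(axis=0))        # column sums, each equal to 1 up to rounding
print(S.sum(axis=1))        # row sums, each equal to 1 up to rounding
```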

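Proof of Theorem 6.8: Similarly to the remark we made at the beginning of the
above proof, we may assume, without loss of generality, that $m = n$ because
of Proposition 2.5. Then we still have the perturbation equations (10.3) and
(10.4). Let $k$ be the index such that

\[ \eta_p := \max_{1\le i\le n} \mathrm{RelDist}_p(\sigma_i, \tilde\sigma_i) = \mathrm{RelDist}_p(\sigma_k, \tilde\sigma_k). \]

If $\eta_p = 0$, the inequality (6.16) is trivial. Assume $\eta_p > 0$. Also assume, without
loss of generality, that

\[ \sigma_k > \tilde\sigma_k \ge 0. \]

Partition $U$, $V$, $\tilde U$, $\tilde V$ as follows

\[ U = (U_1, U_2), \quad V = (V_1, V_2), \quad \tilde U = (\tilde U_1, \tilde U_2) \quad\text{and}\quad \tilde V = (\tilde V_1, \tilde V_2), \]

where $U_1, V_1 \in \mathbb{C}^{n\times k}$ and $\tilde U_1, \tilde V_1 \in \mathbb{C}^{n\times(k-1)}$. Write $\Sigma = \mathrm{diag}(\Sigma_1, \Sigma_2)$ and
$\tilde\Sigma = \mathrm{diag}(\tilde\Sigma_1, \tilde\Sigma_2)$, where $\Sigma_1 \in \mathbb{R}^{k\times k}$ and $\tilde\Sigma_1 \in \mathbb{R}^{(k-1)\times(k-1)}$. It follows from the
equations (10.3) and (10.4) that

\[ \Sigma_1V_1^*\tilde V_2 - U_1^*\tilde U_2\tilde\Sigma_2 = \Sigma_1V_1^*(I - D_2)\tilde V_2 + U_1^*(D_1^{-*} - I)\tilde U_2\tilde\Sigma_2, \]
\[ \Sigma_1U_1^*\tilde U_2 - V_1^*\tilde V_2\tilde\Sigma_2 = \Sigma_1U_1^*(I - D_1)\tilde U_2 + V_1^*(D_2^{-*} - I)\tilde V_2\tilde\Sigma_2, \]

which yield

\[ V_1^*\tilde V_2 - \Sigma_1^{-1}U_1^*\tilde U_2\tilde\Sigma_2 = V_1^*(I - D_2)\tilde V_2 + \Sigma_1^{-1}U_1^*(D_1^{-*} - I)\tilde U_2\tilde\Sigma_2, \tag{10.7} \]
\[ U_1^*\tilde U_2 - \Sigma_1^{-1}V_1^*\tilde V_2\tilde\Sigma_2 = U_1^*(I - D_1)\tilde U_2 + \Sigma_1^{-1}V_1^*(D_2^{-*} - I)\tilde V_2\tilde\Sigma_2. \tag{10.8} \]

Lemma 9.4 implies that $\|U_1^*\tilde U_2\|_2 = \|V_1^*\tilde V_2\|_2 = 1$, since $U_1^*\tilde U_2$ is a $k \times (n - k + 1)$
submatrix of $U^*\tilde U \in \mathbb{U}_n$ and $V_1^*\tilde V_2$ is a $k \times (n - k + 1)$ submatrix of $V^*\tilde V \in \mathbb{U}_n$
and $k + (n - k + 1) = n + 1 > n$. So it follows from (10.7) that

\[ \begin{aligned}
1 - \frac{\tilde\sigma_k}{\sigma_k}
&= \|V_1^*\tilde V_2\|_2 - \|\Sigma_1^{-1}\|_2\, \|U_1^*\tilde U_2\|_2\, \|\tilde\Sigma_2\|_2 \\
&\le \|V_1^*\tilde V_2\|_2 - \|\Sigma_1^{-1}U_1^*\tilde U_2\tilde\Sigma_2\|_2 \\
&\le \|V_1^*\tilde V_2 - \Sigma_1^{-1}U_1^*\tilde U_2\tilde\Sigma_2\|_2 \\
&= \|V_1^*(I - D_2)\tilde V_2 + \Sigma_1^{-1}U_1^*(D_1^{-*} - I)\tilde U_2\tilde\Sigma_2\|_2 \\
&\le \|I - D_2\|_2 + \frac{\tilde\sigma_k}{\sigma_k}\|D_1^{-*} - I\|_2 \\
&\le \sqrt[p]{1 + (\tilde\sigma_k/\sigma_k)^p}\; \sqrt[q]{\|I - D_2\|_2^q + \|D_1^{-*} - I\|_2^q},
\end{aligned} \]

where the last step is Hölder's inequality with $1/p + 1/q = 1$. Therefore

\[ \eta_p = \frac{1 - \tilde\sigma_k/\sigma_k}{\sqrt[p]{1 + (\tilde\sigma_k/\sigma_k)^p}} \le \sqrt[q]{\|I - D_2\|_2^q + \|D_1^{-*} - I\|_2^q}. \]

Similarly, it follows from (10.8) that

\[ \eta_p = \frac{1 - \tilde\sigma_k/\sigma_k}{\sqrt[p]{1 + (\tilde\sigma_k/\sigma_k)^p}} \le \sqrt[q]{\|I - D_1\|_2^q + \|D_2^{-*} - I\|_2^q}. \]

The inequality (6.16) is a consequence of the last two inequalities. ∎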
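11  Proofs  of  Theorems  5.1,  5.2,  5.3  and  5.4

Proof of Theorem 5.1: Since $A$ is nonnegative definite, there is a matrix $B \in \mathbb{C}^{n\times n}$ such
that $A = B^*B$. With this $B$, $\tilde A = D^*AD = D^*B^*BD = \tilde B^*\tilde B$, where $\tilde B = BD$.
Let the SVDs of $B$ and $\tilde B$ be

\[ B = U\Lambda^{1/2}V^* \quad\text{and}\quad \tilde B = \tilde U\tilde\Lambda^{1/2}\tilde V^*, \]

where

\[ \Lambda^{1/2} = \mathrm{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}) \quad\text{and}\quad \tilde\Lambda^{1/2} = \mathrm{diag}(\sqrt{\tilde\lambda_1}, \ldots, \sqrt{\tilde\lambda_n}). \]

In what follows, we actually work with $BB^*$ and $\tilde B\tilde B^*$, instead of $A = B^*B$ and
$\tilde A = \tilde B^*\tilde B$.

\[ \tilde B\tilde B^* - BB^* = \tilde BD^*B^* - \tilde BD^{-1}B^* = \tilde B(D^* - D^{-1})B^*, \]
\[ \tilde U^*(\tilde B\tilde B^* - BB^*)U = \tilde\Lambda\tilde U^*U - \tilde U^*U\Lambda, \]
\[ \tilde U^*\tilde B(D^* - D^{-1})B^*U = \tilde\Lambda^{1/2}\tilde V^*(D^* - D^{-1})V\Lambda^{1/2}. \]

Thus, we have the following perturbation equation:

\[ \tilde\Lambda\tilde U^*U - \tilde U^*U\Lambda = \tilde\Lambda^{1/2}\tilde V^*(D^* - D^{-1})V\Lambda^{1/2}. \tag{11.1} \]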

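Write $Q := \tilde U^*U = (q_{ij})$. It follows from (11.1) that

\[ \|\tilde V^*(D^* - D^{-1})V\|_F^2 = \|D^* - D^{-1}\|_F^2 \ge \sum_{i,j=1}^n \frac{|\tilde\lambda_i - \lambda_j|^2}{\tilde\lambda_i\lambda_j}\, |q_{ij}|^2. \]

Since $(|q_{ij}|^2)$ is a doubly stochastic matrix, applying Lemma 9.2 concludes the
proof of the inequality (5.2). To show (5.1), let $k$ be the index such that

\[ \eta := \max_{1\le i\le n} \mathrm{RelDist}(\lambda_i, \tilde\lambda_i) = \mathrm{RelDist}(\lambda_k, \tilde\lambda_k). \]

If $\eta = 0$, no proof is necessary. Assume $\eta > 0$. Also assume, without loss of
generality, that

\[ \lambda_k > \tilde\lambda_k \ge 0. \]

Partition $U$, $V$, $\tilde U$, $\tilde V$ as follows

\[ U = (U_1, U_2), \quad V = (V_1, V_2), \quad \tilde U = (\tilde U_1, \tilde U_2) \quad\text{and}\quad \tilde V = (\tilde V_1, \tilde V_2), \]

where $U_1, V_1 \in \mathbb{C}^{n\times k}$ and $\tilde U_1, \tilde V_1 \in \mathbb{C}^{n\times(k-1)}$, and write $\Lambda = \mathrm{diag}(\Lambda_1, \Lambda_2)$ and
$\tilde\Lambda = \mathrm{diag}(\tilde\Lambda_1, \tilde\Lambda_2)$, where $\Lambda_1 \in \mathbb{R}^{k\times k}$ and $\tilde\Lambda_1 \in \mathbb{R}^{(k-1)\times(k-1)}$. It follows from the
equation (11.1) that

\[ \tilde\Lambda_2\tilde U_2^*U_1 - \tilde U_2^*U_1\Lambda_1 = \tilde\Lambda_2^{1/2}\tilde V_2^*(D^* - D^{-1})V_1\Lambda_1^{1/2}, \]

which yields

\[ \tilde\Lambda_2\tilde U_2^*U_1\Lambda_1^{-1} - \tilde U_2^*U_1 = \tilde\Lambda_2^{1/2}\tilde V_2^*(D^* - D^{-1})V_1\Lambda_1^{-1/2}. \tag{11.2} \]

Lemma 9.4 implies that $\|\tilde U_2^*U_1\|_2 = 1$ since $\tilde U_2^*U_1$ is an $(n - k + 1) \times k$ submatrix
of $\tilde U^*U$ and $k + (n - k + 1) = n + 1 > n$. So it follows from (11.2) that

\[ \begin{aligned}
1 - \frac{\tilde\lambda_k}{\lambda_k}
&= \|\tilde U_2^*U_1\|_2 - \|\tilde\Lambda_2\|_2\, \|\tilde U_2^*U_1\|_2\, \|\Lambda_1^{-1}\|_2 \\
&\le \|\tilde U_2^*U_1\|_2 - \|\tilde\Lambda_2\tilde U_2^*U_1\Lambda_1^{-1}\|_2 \\
&\le \|\tilde U_2^*U_1 - \tilde\Lambda_2\tilde U_2^*U_1\Lambda_1^{-1}\|_2 \\
&= \|\tilde\Lambda_2^{1/2}\tilde V_2^*(D^* - D^{-1})V_1\Lambda_1^{-1/2}\|_2 \\
&\le \|\tilde\Lambda_2^{1/2}\|_2\, \|\tilde V_2^*(D^* - D^{-1})V_1\|_2\, \|\Lambda_1^{-1/2}\|_2 \\
&\le \sqrt{\frac{\tilde\lambda_k}{\lambda_k}}\, \|D^* - D^{-1}\|_2,
\end{aligned} \]

an immediate consequence of which is the inequality (5.1). ∎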
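
The conclusion of this argument, $\max_i \mathrm{RelDist}(\lambda_i, \tilde\lambda_i) \le \|D^* - D^{-1}\|_2$, can be illustrated on a random example. The sketch below assumes a random real factor $B$ and a random $D$ near $I$, and follows the proof in computing the eigenvalues of $A = B^*B$ and $\tilde A = D^*AD$ as squared singular values of $B$ and $\tilde B = BD$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
B = rng.standard_normal((n, n))                       # A = B^T B is positive definite (almost surely)
D = np.eye(n) + 0.1 * rng.standard_normal((n, n))     # nonsingular, close to I

# Eigenvalues of A = B^* B and A~ = D^* A D = (BD)^* (BD), in descending order.
lam = np.linalg.svd(B, compute_uv=False) ** 2
lam_t = np.linalg.svd(B @ D, compute_uv=False) ** 2

rel = np.abs(lam - lam_t) / np.sqrt(lam * lam_t)      # RelDist(lam_i, lam~_i)
bound2 = np.linalg.norm(D.T - np.linalg.inv(D), 2)    # ||D^* - D^{-1}||_2
print(rel.max(), "<=", bound2)
```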
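Proof of Theorem 5.2: Set $\widehat B = BD_2$ and denote

\[ \sigma(\widehat B) = \{\hat\sigma_1 \ge \hat\sigma_2 \ge \cdots \ge \hat\sigma_n\}. \]

Applying Theorem 5.1 to $B^*B$ and $\widehat B^*\widehat B = D_2^*B^*BD_2$ leads to

\[ \max_{1\le i\le n} \mathrm{RelDist}(\sigma_i^2, \hat\sigma_i^2) \le \|D_2^* - D_2^{-1}\|_2, \qquad \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i^2, \hat\sigma_i^2)]^2} \le \|D_2^* - D_2^{-1}\|_F. \]

Now applying Proposition 2.11, we obtain

\[ \max_{1\le i\le n} \mathrm{RelDist}(\sigma_i, \hat\sigma_i) \le \frac12\|D_2^* - D_2^{-1}\|_2, \qquad \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \hat\sigma_i)]^2} \le \frac12\|D_2^* - D_2^{-1}\|_F. \]

Similarly for $BD_2$ and $D_1^*BD_2$, we have

\[ \max_{1\le i\le n} \mathrm{RelDist}(\hat\sigma_i, \tilde\sigma_i) \le \frac12\|D_1^* - D_1^{-1}\|_2, \qquad \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\hat\sigma_i, \tilde\sigma_i)]^2} \le \frac12\|D_1^* - D_1^{-1}\|_F. \]

Since RelDist is a generalized metric on $\mathbb{R}_{\ge 0}$, we get

\[ \mathrm{RelDist}(\sigma_i, \tilde\sigma_i) \le \mathrm{RelDist}(\sigma_i, \hat\sigma_i) + \mathrm{RelDist}(\hat\sigma_i, \tilde\sigma_i) \le \frac12\big(\|D_1^* - D_1^{-1}\|_2 + \|D_2^* - D_2^{-1}\|_2\big), \]

\[ \begin{aligned}
\sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \tilde\sigma_i)]^2}
&\le \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \hat\sigma_i) + \mathrm{RelDist}(\hat\sigma_i, \tilde\sigma_i)]^2} \\
&\le \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\sigma_i, \hat\sigma_i)]^2} + \sqrt{\sum_{i=1}^n [\mathrm{RelDist}(\hat\sigma_i, \tilde\sigma_i)]^2} \\
&\le \frac12\big(\|D_1^* - D_1^{-1}\|_F + \|D_2^* - D_2^{-1}\|_F\big),
\end{aligned} \]

as expected. ∎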

Proof of Theorem 5.3: Write

\[ \tilde G = (B + \Delta B)D = (I + (\Delta B)B^{-1})BD = \widehat D G, \]

where $\widehat D = I + (\Delta B)B^{-1}$. Now applying Theorem 5.2 above to $G$ and $\tilde G =
\widehat D G$ yields the first inequalities in both (5.9) and (5.10). To get the second
inequalities, we notice

\[ (I + E)^* - (I + E)^{-1} = I + E^* - \sum_{i=0}^\infty (-1)^i E^i = E^* + E + E\sum_{i=2}^\infty (-1)^{i-1} E^{i-1}, \]

where $E = (\Delta B)B^{-1}$, and therefore for any unitarily invariant norm $|||\cdot|||$

\[ |||(I + E)^* - (I + E)^{-1}||| \le |||E + E^*||| + \sum_{i=1}^\infty \|E\|_2^i\, |||E||| = |||E + E^*||| + \frac{\|E\|_2}{1 - \|E\|_2}\, |||E|||. \]

The rest is trivial. ∎
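Proof of Theorem 5.4: Rewrite $H$ and $\tilde H$ as

\[ H = D^*AD = (A^{1/2}D)^*A^{1/2}D =: B^*B, \]
\[ \begin{aligned}
\tilde H &= D^*A^{1/2}\big(I + A^{-1/2}(\Delta A)A^{-1/2}\big)A^{1/2}D \\
&= \Big(\big(I + A^{-1/2}(\Delta A)A^{-1/2}\big)^{1/2}A^{1/2}D\Big)^* \big(I + A^{-1/2}(\Delta A)A^{-1/2}\big)^{1/2}A^{1/2}D \\
&=: \tilde B^*\tilde B,
\end{aligned} \]

where

\[ B := A^{1/2}D, \qquad \tilde B := \big(I + A^{-1/2}(\Delta A)A^{-1/2}\big)^{1/2}A^{1/2}D. \]

Set $\widehat D = (I + A^{-1/2}(\Delta A)A^{-1/2})^{1/2}$. Thus $\tilde B = \widehat D B$. Notice that $\lambda(H) =
\lambda(B^*B) = \lambda(BB^*)$ and $\lambda(\tilde H) = \lambda(\tilde B^*\tilde B) = \lambda(\tilde B\tilde B^*)$ and $\tilde B\tilde B^* = \widehat D BB^*\widehat D^*$. So
applying Theorem 5.1 to $BB^*$ and $\tilde B\tilde B^*$ yields the first "≤" in both (5.13) and
(5.14). ∎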




12  Proof  of  Theorem  6.6 


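There is nothing to prove if $\tilde\lambda \in \lambda(A)$. Assume that $\tilde\lambda \notin \lambda(A)$. Here we will
prove the case when $\tilde A = DA$ only, since the proof for the case when $\tilde A = AD$
is very similar. Consider $\tilde A - \tilde\lambda I$.

\[ \begin{aligned}
\tilde A - \tilde\lambda I &= A - \tilde\lambda I + \tilde A - A \\
&= X(\Lambda - \tilde\lambda I)X^{-1} + (D - I)X\Lambda X^{-1} \\
&= X\big[I + X^{-1}(D - I)X\Lambda(\Lambda - \tilde\lambda I)^{-1}\big](\Lambda - \tilde\lambda I)X^{-1}.
\end{aligned} \]

Since $\tilde A - \tilde\lambda I$ is singular, we have for any $1 \le p \le \infty$

\[ \|X^{-1}(D - I)X\Lambda(\Lambda - \tilde\lambda I)^{-1}\|_p \ge 1, \]

which gives

\[ 1 \le \|X^{-1}(D - I)X\|_p\, \|\Lambda(\Lambda - \tilde\lambda I)^{-1}\|_p = \|X^{-1}(D - I)X\|_p \max_{1\le i\le n} \frac{|\lambda_i|}{|\lambda_i - \tilde\lambda|}, \]

as was to be shown. ∎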
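
Read for $p = 2$, the last inequality says that every eigenvalue $\tilde\lambda$ of $\tilde A = DA$ satisfies $\min_i |\lambda_i - \tilde\lambda| / |\lambda_i| \le \|X^{-1}(D - I)X\|_2$. A minimal numerical sketch, assuming a random diagonalizable $A$ with nonzero eigenvalues and a random $D$ close to $I$:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
A = rng.standard_normal((n, n))
D = np.eye(n) + 0.05 * rng.standard_normal((n, n))
A_t = D @ A                                    # the case A~ = D A

lam, X = np.linalg.eig(A)                      # A = X Lam X^{-1}
lam_t = np.linalg.eigvals(A_t)

# kappa = ||X^{-1}(D - I)X||_2
kappa = np.linalg.norm(np.linalg.inv(X) @ (D - np.eye(n)) @ X, 2)

# For each eigenvalue mu of A~: min_i |lam_i - mu| / |lam_i| <= kappa.
gaps = [np.min(np.abs(lam - mu) / np.abs(lam)) for mu in lam_t]
print(max(gaps), "<=", kappa)
```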
A  Is  RelDist_p  a  Metric?

In this appendix, we will prove (2.22) under certain conditions. As a result, we
will see

1. $\mathrm{RelDist}_p$ is a metric on $\mathbb{R}_{\ge 0}$;

2. $\mathrm{RelDist}_1$, $\mathrm{RelDist}_2$ and $\mathrm{RelDist}_\infty$ are metrics on $\mathbb{R}$.

We strongly conjecture that $\mathrm{RelDist}_p$ is a metric on $\mathbb{C}$. Unfortunately, we are
unable to prove it at this point.

Lemma A.1 The following statements are equivalent:

1. $\mathrm{RelDist}_p(\alpha, \gamma) \le \mathrm{RelDist}_p(\alpha, \beta) + \mathrm{RelDist}_p(\beta, \gamma)$;

2. $\mathrm{RelDist}_p(\xi\alpha, \xi\gamma) \le \mathrm{RelDist}_p(\xi\alpha, \xi\beta) + \mathrm{RelDist}_p(\xi\beta, \xi\gamma)$ for some $0 \ne \xi \in \mathbb{C}$;

3. $\mathrm{RelDist}_p(\xi\alpha, \xi\gamma) \le \mathrm{RelDist}_p(\xi\alpha, \xi\beta) + \mathrm{RelDist}_p(\xi\beta, \xi\gamma)$ for all $0 \ne \xi \in \mathbb{C}$.

The proof of this lemma is trivial, just by Property 3 of Proposition 2.1. With
Lemma A.1 in mind and since swapping $\alpha$ and $\gamma$ does not lose any generality,
we may assume from now on

\[ \alpha \le |\alpha| \le \gamma. \tag{A.1} \]

The inequality (2.22) is trivial when one of $\alpha$, $\beta$, $\gamma$ is zero or $\beta = \alpha$ or $\beta = \gamma$.
So from now on, we may assume

\[ \alpha, \beta, \gamma \ne 0 \quad\text{and}\quad \alpha \ne \beta \ne \gamma. \tag{A.2} \]

Now there are three possible positions for $\beta$:

\[ \beta < \alpha \quad\text{or}\quad \alpha < \beta < \gamma \quad\text{or}\quad \gamma < \beta. \]

When $\alpha < 0$, we split the case $\beta < \alpha$ into two subcases:

\[ \beta < -\gamma \quad\text{or}\quad -\gamma \le \beta < \alpha. \]

Also in the case $\alpha < 0$, without loss of generality, we may assume $\alpha = -1$ by
Lemma A.1. We summarize the cases we have to handle separately as
follows.

1. $\alpha < \beta < \gamma$;

2. $\alpha\gamma > 0$, i.e., $\alpha$ and $\gamma$ are of the same sign;

3. $\alpha = -1 < 1 \le \gamma < \beta$;

4. $-\gamma \le \beta < \alpha = -1 < 1 \le \gamma$; and

5. $\beta < -\gamma \le \alpha = -1 < 1 \le \gamma$.

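Lemma A.2 (2.22) holds for $\alpha \le \beta \le \gamma$, and the equality sign holds if and
only if $\beta = \alpha$ or $\beta = \gamma$.

This lemma actually implies that (2.22) holds if $\beta$ lies between $\alpha$ and $\gamma$ for all
$\alpha, \gamma \in \mathbb{R}$, not just those $\alpha$ and $\gamma$ satisfying (A.1).

Proof: Assume $\alpha \ne \beta \ne \gamma$. Because of (A.1), we have $\gamma - \alpha = \gamma - \beta + \beta - \alpha$
and thus for $1 \le p < \infty$

\[ \begin{aligned}
\mathrm{RelDist}_p(\alpha, \gamma) &= \frac{\gamma - \alpha}{\sqrt[p]{\gamma^p + |\alpha|^p}} = \frac{\gamma - \beta}{\sqrt[p]{\gamma^p + |\alpha|^p}} + \frac{\beta - \alpha}{\sqrt[p]{\gamma^p + |\alpha|^p}} \\
&= \frac{\gamma - \beta}{\sqrt[p]{\gamma^p + |\beta|^p}} + \frac{\beta - \alpha}{\sqrt[p]{|\beta|^p + |\alpha|^p}} \\
&\quad + (\gamma - \beta)\left(\frac{1}{\sqrt[p]{\gamma^p + |\alpha|^p}} - \frac{1}{\sqrt[p]{\gamma^p + |\beta|^p}}\right)
 + (\beta - \alpha)\left(\frac{1}{\sqrt[p]{\gamma^p + |\alpha|^p}} - \frac{1}{\sqrt[p]{|\alpha|^p + |\beta|^p}}\right) \\
&= \mathrm{RelDist}_p(\alpha, \beta) + \mathrm{RelDist}_p(\beta, \gamma) \\
&\quad + \frac{(\gamma - \beta)(|\beta|^p - |\alpha|^p)}{\sqrt[p]{\gamma^p + |\alpha|^p}\,\sqrt[p]{\gamma^p + |\beta|^p}} \cdot \frac{\sqrt[p]{\gamma^p + |\beta|^p} - \sqrt[p]{\gamma^p + |\alpha|^p}}{|\beta|^p - |\alpha|^p} \\
&\quad + \frac{(\beta - \alpha)(|\beta|^p - \gamma^p)}{\sqrt[p]{\gamma^p + |\alpha|^p}\,\sqrt[p]{|\alpha|^p + |\beta|^p}} \cdot \frac{\sqrt[p]{|\alpha|^p + |\beta|^p} - \sqrt[p]{\gamma^p + |\alpha|^p}}{|\beta|^p - \gamma^p}.
\end{aligned} \]

Now if $\alpha < \beta < |\alpha| \le \gamma$, then $|\beta|^p - |\alpha|^p < 0$ and $|\beta|^p - \gamma^p < 0$, and thus

\[ \frac{(\gamma - \beta)(|\beta|^p - |\alpha|^p)}{\sqrt[p]{\gamma^p + |\alpha|^p}\,\sqrt[p]{\gamma^p + |\beta|^p}} \cdot \frac{\sqrt[p]{\gamma^p + |\beta|^p} - \sqrt[p]{\gamma^p + |\alpha|^p}}{|\beta|^p - |\alpha|^p}
 + \frac{(\beta - \alpha)(|\beta|^p - \gamma^p)}{\sqrt[p]{\gamma^p + |\alpha|^p}\,\sqrt[p]{|\alpha|^p + |\beta|^p}} \cdot \frac{\sqrt[p]{|\alpha|^p + |\beta|^p} - \sqrt[p]{\gamma^p + |\alpha|^p}}{|\beta|^p - \gamma^p} < 0. \]

Hence $\mathrm{RelDist}_p(\alpha, \gamma) < \mathrm{RelDist}_p(\alpha, \beta) + \mathrm{RelDist}_p(\beta, \gamma)$. Consider now $|\alpha| \le
\beta < \gamma$. Then

\[ \frac{(\gamma - \beta)(|\beta|^p - |\alpha|^p)}{\sqrt[p]{\gamma^p + |\alpha|^p}\,\sqrt[p]{\gamma^p + |\beta|^p}} \cdot \frac{\sqrt[p]{\gamma^p + |\beta|^p} - \sqrt[p]{\gamma^p + |\alpha|^p}}{|\beta|^p - |\alpha|^p}
 + \frac{(\beta - \alpha)(|\beta|^p - \gamma^p)}{\sqrt[p]{\gamma^p + |\alpha|^p}\,\sqrt[p]{|\alpha|^p + |\beta|^p}} \cdot \frac{\sqrt[p]{|\alpha|^p + |\beta|^p} - \sqrt[p]{\gamma^p + |\alpha|^p}}{|\beta|^p - \gamma^p} \]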
Lemma A.3 (2.22) holds for $\alpha\gamma > 0$.

Proof: Lemma A.2 shows that (2.22) is true if $\alpha \le \beta \le \gamma$. If either $\beta < \alpha$ or
$\gamma < \beta$, (2.22) follows from Property 8 of Proposition 2.1. ∎

As an immediate consequence of Lemma A.3, we have

Proposition A.1 $\mathrm{RelDist}_p$ is a metric on $\mathbb{R}_{\ge 0}$.

Lemma A.4 (2.22) holds for $-\gamma \le \beta \le \alpha < 0 < |\alpha| \le \gamma$, and the equality sign
holds if and only if $\beta = \alpha$.

Proof: Assume $\beta \ne \alpha$. Define

\[ f(\xi) := \frac{\gamma + \xi}{\sqrt[p]{\gamma^p + \xi^p}} \quad\text{for } |\alpha| \le \xi \le \gamma. \]

Clearly if $p = \infty$, $f(\xi) = (\gamma + \xi)/\gamma$ increases in $|\alpha| \le \xi \le \gamma$; if $1 \le p < \infty$, we have

\[ f'(\xi) = \frac{\gamma(\gamma^{p-1} - \xi^{p-1})}{(\gamma^p + \xi^p)^{1 + 1/p}} \ge 0 \quad\text{for } |\alpha| \le \xi \le \gamma. \]

So $f(\xi)$ is an increasing function for all $p$. Hence

\[ \mathrm{RelDist}_p(\alpha, \gamma) = f(-\alpha) \le f(-\beta) = \mathrm{RelDist}_p(\beta, \gamma) < \mathrm{RelDist}_p(\alpha, \beta) + \mathrm{RelDist}_p(\beta, \gamma), \]

as was to be proved. ∎

Proposition A.2 $\mathrm{RelDist}_1$, $\mathrm{RelDist}_2$ and $\mathrm{RelDist}_\infty$ are metrics on $\mathbb{R}$.

Proof: We have to prove (2.22) with $p = 1, 2$ and $\infty$ for all 5 cases listed at the
beginning of this appendix. But Case 1 has been covered by Lemma A.2, Case
2 by Lemma A.3, and Case 4 by Lemma A.4. Cases 3 and 5 are dealt
with by Lemmas A.5 and A.6 below.

Lemma A.5 (2.22) with $p = 1, 2$ or $\infty$ holds for $\alpha = -1 < 1 \le \gamma \le \beta$. When
$p = 1, 2$, the equality sign holds if and only if $\beta = \gamma$; when $p = \infty$, the equality
sign holds if and only if either $\beta = \gamma$ or $\gamma = 1$.

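Proof: Assume $\beta \ne \gamma$. First consider the case $p = 2$. Define

\[ f(\xi) := \frac{\xi + 1}{\sqrt{\xi^2 + 1}} + \frac{\xi - \gamma}{\sqrt{\xi^2 + \gamma^2}}. \]

We are going to show that $f'(\xi) > 0$ for $\xi > \gamma$ and thus

\[ \mathrm{RelDist}_2(-1, \beta) + \mathrm{RelDist}_2(\beta, \gamma) = f(\beta) > f(\gamma) = \frac{\gamma + 1}{\sqrt{\gamma^2 + 1}} = \mathrm{RelDist}_2(-1, \gamma), \]

which concludes the proof for the present case. We have

\[ f'(\xi) = \frac{1 - \xi}{(\xi^2 + 1)^{3/2}} + \frac{\gamma(\xi + \gamma)}{(\xi^2 + \gamma^2)^{3/2}}. \]

So to show $f'(\xi) > 0$, it suffices for us to show for $\xi > \gamma \ge 1$

\[ \gamma(\xi + \gamma)(\xi^2 + 1)^{3/2} > (\xi - 1)(\xi^2 + \gamma^2)^{3/2}, \]

or equivalently, to show for $\xi > \gamma \ge 1$

\[ (\xi - 1)^2(\xi^2 + \gamma^2)^3 - \gamma^2(\xi + \gamma)^2(\xi^2 + 1)^3 < 0. \]

But tedious algebraic manipulations yield the following

\[ \begin{aligned}
(\xi - 1)^2&(\xi^2 + \gamma^2)^3 - \gamma^2(\xi + \gamma)^2(\xi^2 + 1)^3 \\
&= -\gamma^4 + \gamma^6 - 2\gamma^3(1 + \gamma^3)\xi + \gamma^2(\gamma^4 - 1)\xi^2 - 6\gamma^3(1 + \gamma)\xi^3 \\
&\quad - 6\gamma^2(1 + \gamma)\xi^5 + (1 - \gamma^4)\xi^6 - 2(1 + \gamma^3)\xi^7 + (1 - \gamma^2)\xi^8 \\
&= -\gamma^4 + \gamma^3(\gamma^3 - 6\xi^3) - 2\gamma^3(1 + \gamma^3)\xi - \gamma^2\xi^2 + \gamma^3\xi^2(\gamma^3 - 6\xi^3) \\
&\quad - 6\gamma^4\xi^3 - 6\gamma^2\xi^5 + (1 - \gamma^4)\xi^6 - 2(1 + \gamma^3)\xi^7 + (1 - \gamma^2)\xi^8 \\
&< 0,
\end{aligned} \]

as required. This completes the proof for $p = 2$.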

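We have to show (2.22) for $p = 1$ or $\infty$. For the moment, let us see what
the implication of (2.22) is for any $1 \le p \le \infty$ in this particular case. Notice
$\gamma + 1 = \beta + 1 - (\beta - \gamma)$ and

\[ \begin{aligned}
\mathrm{RelDist}_p(-1, \gamma) &= \frac{\gamma + 1}{\sqrt[p]{\gamma^p + 1}} = \frac{\beta + 1}{\sqrt[p]{\gamma^p + 1}} - \frac{\beta - \gamma}{\sqrt[p]{\gamma^p + 1}} \\
&= \frac{\beta + 1}{\sqrt[p]{\beta^p + 1}} + \frac{\beta - \gamma}{\sqrt[p]{\beta^p + \gamma^p}} \\
&\quad + (\beta + 1)\left(\frac{1}{\sqrt[p]{\gamma^p + 1}} - \frac{1}{\sqrt[p]{\beta^p + 1}}\right) - (\beta - \gamma)\left(\frac{1}{\sqrt[p]{\gamma^p + 1}} + \frac{1}{\sqrt[p]{\beta^p + \gamma^p}}\right) \\
&= \mathrm{RelDist}_p(-1, \beta) + \mathrm{RelDist}_p(\beta, \gamma) \\
&\quad + (\beta + 1)\frac{\sqrt[p]{\beta^p + 1} - \sqrt[p]{\gamma^p + 1}}{\sqrt[p]{\gamma^p + 1}\,\sqrt[p]{\beta^p + 1}} - (\beta - \gamma)\frac{\sqrt[p]{\beta^p + \gamma^p} + \sqrt[p]{\gamma^p + 1}}{\sqrt[p]{\gamma^p + 1}\,\sqrt[p]{\beta^p + \gamma^p}}.
\end{aligned} \]

So (2.22) holds if and only if

\[ (\beta + 1)\left(\sqrt[p]{\beta^p + 1} - \sqrt[p]{\gamma^p + 1}\right)\sqrt[p]{\beta^p + \gamma^p} \le (\beta - \gamma)\left(\sqrt[p]{\beta^p + \gamma^p} + \sqrt[p]{\gamma^p + 1}\right)\sqrt[p]{\beta^p + 1}, \]

or equivalently

\[ \sqrt[p]{\beta^p + \gamma^p}\left((\gamma + 1)\sqrt[p]{\beta^p + 1} - (\beta + 1)\sqrt[p]{\gamma^p + 1}\right) \le (\beta - \gamma)\sqrt[p]{\gamma^p + 1}\,\sqrt[p]{\beta^p + 1}, \]

which is true if and only if

\[ \sqrt[p]{\beta^p + \gamma^p}\left(\frac{\gamma + 1}{\sqrt[p]{\gamma^p + 1}} - \frac{\beta + 1}{\sqrt[p]{\beta^p + 1}}\right) \le \beta - \gamma. \tag{A.3} \]

Our proof will be completed if we can prove (A.3) for $p = 1$ or $\infty$. When $p = 1$,
the left-hand side of (A.3) is zero and its right-hand side is $\beta - \gamma > 0$. When
$p = \infty$,

\[ \text{the left-hand side of (A.3)} = \beta\left(\frac{\gamma + 1}{\gamma} - \frac{\beta + 1}{\beta}\right) = \frac{\beta - \gamma}{\gamma} \le \beta - \gamma. \]

Hence (A.3) holds for both $p = 1$ and $\infty$. ∎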
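Lemma A.6 (2.22) with $p = 1, 2$ or $\infty$ holds for $\beta < -\gamma \le \alpha = -1 < 1 \le \gamma$,
and is strict, unless $p = \infty$ and $\gamma = 1$.

Proof: We want to prove for $p = 1, 2$ and $\infty$

\[ \mathrm{RelDist}_p(-1, \gamma) \le \mathrm{RelDist}_p(-1, \beta) + \mathrm{RelDist}_p(\beta, \gamma), \]

which, by Lemma A.1, is equivalent to

\[ \mathrm{RelDist}_p(1, -\gamma) \le \mathrm{RelDist}_p(1, -\beta) + \mathrm{RelDist}_p(-\beta, -\gamma). \]

Set $\xi = -\beta$. Then $\xi > \gamma \ge 1$. For the moment, let us see what the implication
of (2.22) is for any $1 \le p \le \infty$ in this particular case. Notice that $\gamma + 1 =
\xi + \gamma - (\xi - 1)$, and thus

\[ \begin{aligned}
\mathrm{RelDist}_p(1, -\gamma) &= \frac{\gamma + 1}{\sqrt[p]{\gamma^p + 1}} = \frac{\xi + \gamma}{\sqrt[p]{\gamma^p + 1}} - \frac{\xi - 1}{\sqrt[p]{\gamma^p + 1}} \\
&= \frac{\xi + \gamma}{\sqrt[p]{\xi^p + \gamma^p}} + \frac{\xi - 1}{\sqrt[p]{\xi^p + 1}} \\
&\quad + (\xi + \gamma)\left(\frac{1}{\sqrt[p]{\gamma^p + 1}} - \frac{1}{\sqrt[p]{\xi^p + \gamma^p}}\right) - (\xi - 1)\left(\frac{1}{\sqrt[p]{\gamma^p + 1}} + \frac{1}{\sqrt[p]{\xi^p + 1}}\right) \\
&= \mathrm{RelDist}_p(\xi, -\gamma) + \mathrm{RelDist}_p(1, \xi) \\
&\quad + (\xi + \gamma)\frac{\sqrt[p]{\xi^p + \gamma^p} - \sqrt[p]{\gamma^p + 1}}{\sqrt[p]{\gamma^p + 1}\,\sqrt[p]{\xi^p + \gamma^p}} - (\xi - 1)\frac{\sqrt[p]{\xi^p + 1} + \sqrt[p]{\gamma^p + 1}}{\sqrt[p]{\gamma^p + 1}\,\sqrt[p]{\xi^p + 1}}.
\end{aligned} \]

So (2.22) holds if and only if

\[ (\xi + \gamma)\frac{\sqrt[p]{\xi^p + \gamma^p} - \sqrt[p]{\gamma^p + 1}}{\sqrt[p]{\gamma^p + 1}\,\sqrt[p]{\xi^p + \gamma^p}} - (\xi - 1)\frac{\sqrt[p]{\xi^p + 1} + \sqrt[p]{\gamma^p + 1}}{\sqrt[p]{\gamma^p + 1}\,\sqrt[p]{\xi^p + 1}} \le 0, \tag{A.4} \]

or equivalently

\[ (\xi + \gamma)\sqrt[p]{\xi^p + 1}\left(\sqrt[p]{\xi^p + \gamma^p} - \sqrt[p]{\gamma^p + 1}\right) \le (\xi - 1)\sqrt[p]{\gamma^p + \xi^p}\left(\sqrt[p]{\xi^p + 1} + \sqrt[p]{\gamma^p + 1}\right), \]

or equivalently

\[ \sqrt[p]{\gamma^p + \xi^p}\left((\gamma + 1)\sqrt[p]{\xi^p + 1} - (\xi - 1)\sqrt[p]{\gamma^p + 1}\right) \le (\gamma + \xi)\sqrt[p]{\xi^p + 1}\,\sqrt[p]{\gamma^p + 1}, \]

which holds if and only if

\[ \sqrt[p]{\gamma^p + \xi^p}\left(\frac{\gamma + 1}{\sqrt[p]{\gamma^p + 1}} - \frac{\xi - 1}{\sqrt[p]{\xi^p + 1}}\right) \le \gamma + \xi. \tag{A.5} \]

We have to show (A.5) (or (A.4)) for $p = 1, 2$ and $\infty$. When $p = 2$, we will
prove (A.4) by showing for $\xi > \gamma$

\[ \frac{(\xi + \gamma)(\xi^2 - 1)}{\sqrt{\gamma^2 + 1}\,\sqrt{\xi^2 + \gamma^2}\left(\sqrt{\gamma^2 + 1} + \sqrt{\xi^2 + \gamma^2}\right)} - (\xi - 1)\left(\frac{1}{\sqrt{\gamma^2 + 1}} + \frac{1}{\sqrt{\xi^2 + 1}}\right) < 0, \tag{A.6} \]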
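which completes the proof for this case. To show our claim, first, we notice that the
inequality (A.6) is equivalent to

\[ \frac{(\xi + \gamma)(\xi + 1)}{\sqrt{\gamma^2 + 1}\,\sqrt{\xi^2 + \gamma^2}\left(\sqrt{\gamma^2 + 1} + \sqrt{\xi^2 + \gamma^2}\right)} < \frac{1}{\sqrt{\xi^2 + 1}} + \frac{1}{\sqrt{\gamma^2 + 1}}, \]

or equivalently

\[ (\xi + \gamma)(\xi + 1)\frac{\sqrt{\xi^2 + 1}}{\sqrt{\xi^2 + \gamma^2}} < \left(\sqrt{\gamma^2 + 1} + \sqrt{\xi^2 + \gamma^2}\right)\left(\sqrt{\xi^2 + 1} + \sqrt{\gamma^2 + 1}\right). \tag{A.7} \]

Notice that

\[ \text{the left-hand side of (A.7)} \le (\xi + \gamma)(\xi + 1), \qquad \text{the right-hand side of (A.7)} \ge \left(\sqrt{\gamma^2 + 1} + \sqrt{\xi^2 + 1}\right)^2, \]

and

\[ \begin{aligned}
\left(\sqrt{\xi^2 + 1} + \sqrt{\gamma^2 + 1}\right)^2 - (\xi + \gamma)(\xi + 1)
&= \xi^2 + 1 + \gamma^2 + 1 + 2\sqrt{\xi^2 + 1}\sqrt{\gamma^2 + 1} - \xi^2 - (\gamma + 1)\xi - \gamma \\
&> 2\sqrt{\xi^2 + 1}\sqrt{\gamma^2 + 1} - (\gamma + 1)\xi \\
&> 0,
\end{aligned} \]

because

\[ (\gamma + 1)^2\xi^2 = \xi^2\gamma^2 + 2\xi^2\gamma + \xi^2 \le 3\xi^2\gamma^2 + \xi^2, \qquad 4(\xi^2 + 1)(\gamma^2 + 1) = 4\xi^2\gamma^2 + 4\xi^2 + 4\gamma^2 + 4. \]

Next, we are going to show (A.5) for $p = 1$ and $\infty$. When $p = 1$,

\[ \text{the left-hand side of (A.5)} = (\gamma + \xi)\frac{2}{\xi + 1} < \gamma + \xi; \]

when $p = \infty$,

\[ \text{the left-hand side of (A.5)} = \frac{\xi + \gamma}{\gamma} \le \xi + \gamma, \]

and the equality sign holds if and only if $\gamma = 1$. ∎

By now the proof that $\mathrm{RelDist}_1$, $\mathrm{RelDist}_2$ and $\mathrm{RelDist}_\infty$ are metrics on $\mathbb{R}$ is
completed. ∎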

We briefly summarize what we have proved in this appendix.

1. When $p = 1, 2$ or $\infty$, (2.22) is true for all $\alpha, \beta, \gamma \in \mathbb{R}$, and thus $\mathrm{RelDist}_1$,
$\mathrm{RelDist}_2$ and $\mathrm{RelDist}_\infty$ are metrics on $\mathbb{R}$;

2. (2.22) is true for all $\alpha, \beta, \gamma \ge 0$ and for all $1 \le p \le \infty$, and thus $\mathrm{RelDist}_p$
for any $1 \le p \le \infty$ is a metric on $\mathbb{R}_{\ge 0}$;

3. (2.22) for $1 \le p \le \infty$ holds in Case 1, Case 2 and Case 4. But we
do not know whether it holds in Case 3 and/or Case 5. We believe it
does. Showing that (2.22) holds in Case 3 is equivalent to showing (A.3)
for $1 \le \gamma < \beta$; and showing that (2.22) holds in Case 5 is equivalent to
showing (A.5) for $1 \le \gamma < \xi$.
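
Item 1 of this summary can be spot-checked by random sampling. The sketch below assumes the definition $\mathrm{RelDist}_p(\alpha, \beta) = |\alpha - \beta| / \sqrt[p]{|\alpha|^p + |\beta|^p}$ with $\mathrm{RelDist}_p(0, 0) = 0$, which is how the quantity is used throughout this appendix; the function name `rel_dist_p` and the sampling range are illustrative choices.

```python
import numpy as np

def rel_dist_p(a, b, p):
    """RelDist_p(a, b) = |a - b| / (|a|^p + |b|^p)^(1/p); RelDist_p(0, 0) is taken as 0."""
    if a == 0 and b == 0:
        return 0.0
    if np.isinf(p):
        denom = max(abs(a), abs(b))
    else:
        denom = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return abs(a - b) / denom

rng = np.random.default_rng(7)
samples = rng.uniform(-10, 10, size=(20000, 3))   # random real triples (alpha, beta, gamma)

worst = {}
for p in (1, 2, np.inf):
    # Largest observed violation of the triangle inequality (2.22); nonpositive if none.
    worst[p] = max(
        rel_dist_p(a, c, p) - rel_dist_p(a, b, p) - rel_dist_p(b, c, p)
        for a, b, c in samples
    )
print(worst)
```

Sampling only probes the inequality and cannot replace the case analysis above, but it is a quick sanity check on the statement for $p = 1, 2, \infty$.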
B  Is  RelDist  a  Generalized  Metric?


In this appendix, we will prove (2.23) under certain conditions. As a result, we
will see

1. RelDist is a generalized metric on $\mathbb{R}_{\ge 0}$, and a metric on $\mathbb{R}_+$;

2. RelDist is not a generalized metric on $\mathbb{R}$ (nor on $\mathbb{C}$, of course).

Similarly to Lemma A.1, we also have

Lemma B.1 The following statements are equivalent:

1. $\mathrm{RelDist}(\alpha, \gamma) \le \mathrm{RelDist}(\alpha, \beta) + \mathrm{RelDist}(\beta, \gamma)$;

2. $\mathrm{RelDist}(\xi\alpha, \xi\gamma) \le \mathrm{RelDist}(\xi\alpha, \xi\beta) + \mathrm{RelDist}(\xi\beta, \xi\gamma)$ for some $0 \ne \xi \in \mathbb{C}$;

3. $\mathrm{RelDist}(\xi\alpha, \xi\gamma) \le \mathrm{RelDist}(\xi\alpha, \xi\beta) + \mathrm{RelDist}(\xi\beta, \xi\gamma)$ for all $0 \ne \xi \in \mathbb{C}$.

This lemma follows from Property 3 of Proposition 2.8. Again, with
Lemma B.1 in mind and since swapping $\alpha$ and $\gamma$ does not lose any generality,
we may assume (A.1) holds, and also that only the cases (A.2) are interesting. Following
similar arguments, we see that the nontrivial cases are exactly the same 5 cases as
we summarized in §A.

Lemma  B.2  The  inequality  (2.23)  holds  for  a  <  (3  <  7,  and  the  equality  sign 
holds  if  and  only  if  (3  =  a  or  f3  =  7. 


This  lemma  actually  implies  that  (2.23)  holds  if  (3  lies  between  a  and  7  for  all 
a,  7  E  1,  not  just  these  a  and  7  satisfying  (A.l). 

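Proof: Assume $\alpha \ne \beta \ne \gamma$. Because of (A.1), we have $\gamma - \alpha = (\gamma - \beta) + (\beta - \alpha)$, and thus

\[
\begin{aligned}
\mathrm{RelDist}(\alpha, \gamma)
&= \frac{\gamma - \alpha}{\sqrt{|\alpha\gamma|}}
 = \frac{\gamma - \beta}{\sqrt{|\alpha\gamma|}} + \frac{\beta - \alpha}{\sqrt{|\alpha\gamma|}} \\
&= \frac{\gamma - \beta}{\sqrt{|\beta\gamma|}} + \frac{\beta - \alpha}{\sqrt{|\alpha\beta|}}
 + (\gamma - \beta)\left(\frac{1}{\sqrt{|\alpha\gamma|}} - \frac{1}{\sqrt{|\beta\gamma|}}\right)
 + (\beta - \alpha)\left(\frac{1}{\sqrt{|\alpha\gamma|}} - \frac{1}{\sqrt{|\alpha\beta|}}\right) \\
&= \mathrm{RelDist}(\alpha, \beta) + \mathrm{RelDist}(\beta, \gamma)
 + (\gamma - \beta)\,\frac{\sqrt{|\beta|} - \sqrt{|\alpha|}}{\sqrt{|\alpha\beta\gamma|}}
 - (\beta - \alpha)\,\frac{\sqrt{\gamma} - \sqrt{|\beta|}}{\sqrt{|\alpha\beta\gamma|}}.
\end{aligned}
\]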
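Now if $\alpha < \beta \le |\alpha| \le \gamma$, then $\sqrt{|\beta|} - \sqrt{|\alpha|} \le 0$ and $\sqrt{|\beta|} - \sqrt{\gamma} \le 0$, and thus

\[
(\gamma - \beta)\,\frac{\sqrt{|\beta|} - \sqrt{|\alpha|}}{\sqrt{|\alpha\beta\gamma|}}
 - (\beta - \alpha)\,\frac{\sqrt{\gamma} - \sqrt{|\beta|}}{\sqrt{|\alpha\beta\gamma|}} \le 0.
\]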


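Hence $\mathrm{RelDist}(\alpha, \gamma) \le \mathrm{RelDist}(\alpha, \beta) + \mathrm{RelDist}(\beta, \gamma)$. Consider now $|\alpha| < \beta \le \gamma$. Then

\[
\begin{aligned}
(\gamma - \beta)\,\frac{\sqrt{\beta} - \sqrt{|\alpha|}}{\sqrt{|\alpha\beta\gamma|}}
 - (\beta - \alpha)\,\frac{\sqrt{\gamma} - \sqrt{\beta}}{\sqrt{|\alpha\beta\gamma|}}
&\le (\gamma - \beta)\,\frac{\sqrt{\beta} - \sqrt{|\alpha|}}{\sqrt{|\alpha\beta\gamma|}}
 - (\beta - |\alpha|)\,\frac{\sqrt{\gamma} - \sqrt{\beta}}{\sqrt{|\alpha\beta\gamma|}} \\
&= \frac{(\sqrt{\gamma} - \sqrt{\beta})(\sqrt{\beta} - \sqrt{|\alpha|})(\sqrt{\gamma} - \sqrt{|\alpha|})}{\sqrt{|\alpha\beta\gamma|}} \\
&\le 0,
\end{aligned}
\]

as required. ■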

Lemma B.3 (2.23) holds for $\alpha\gamma \ge 0$.

Proof: Lemma B.2 shows that (2.23) is true if $\alpha \le \beta \le \gamma$. If either $\beta < \alpha$ or $\gamma < \beta$, (2.23) follows from Property 6 of Proposition 2.8. ■

As an immediate consequence of Lemma B.3, we have

Proposition B.1 RelDist is a metric on $\mathbb{R}_{\ge 0}$.

Lemma B.4 If $\alpha < 0 < -\alpha \le \gamma \le \beta$, then the inequality (2.23) holds, and the equality sign holds if and only if $\beta = \gamma$.


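Proof: Assume $\beta \ne \gamma$. By Lemma B.1, we may assume $\alpha = -1$. Then we want to have

\[
\frac{\gamma + 1}{\sqrt{\gamma}} < \frac{\beta + 1}{\sqrt{\beta}} + \frac{\beta - \gamma}{\sqrt{\beta\gamma}},
\]

or equivalently,

\[
(\gamma + 1)\sqrt{\beta} - (\beta + 1)\sqrt{\gamma} - (\beta - \gamma) < 0.
\]

This holds, since

\[
\begin{aligned}
(\gamma + 1)\sqrt{\beta} - (\beta + 1)\sqrt{\gamma} - (\beta - \gamma)
&= \gamma\sqrt{\beta} + \sqrt{\beta} - \beta\sqrt{\gamma} - \sqrt{\gamma} + (\gamma - \beta) \\
&= \sqrt{\beta\gamma}\,(\sqrt{\gamma} - \sqrt{\beta}) + \sqrt{\beta} - \sqrt{\gamma} + (\gamma - \beta) \\
&= (\sqrt{\gamma} - \sqrt{\beta})(\sqrt{\beta\gamma} - 1 + \sqrt{\gamma} + \sqrt{\beta}) \\
&< 0,
\end{aligned}
\]

as was to be shown.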
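
The factorization used in the proof of Lemma B.4 can be spot-checked numerically. The sketch below is illustrative only; `lhs` and `factored` are hypothetical helper names, and the sample points are arbitrary choices satisfying $1 \le \gamma < \beta$ (with $\alpha = -1$ as in the proof).

```python
import math

def lhs(beta, gamma):
    # (gamma + 1) sqrt(beta) - (beta + 1) sqrt(gamma) - (beta - gamma)
    return ((gamma + 1) * math.sqrt(beta)
            - (beta + 1) * math.sqrt(gamma)
            - (beta - gamma))

def factored(beta, gamma):
    # (sqrt(gamma) - sqrt(beta)) (sqrt(beta gamma) - 1 + sqrt(gamma) + sqrt(beta))
    sb, sg = math.sqrt(beta), math.sqrt(gamma)
    return (sg - sb) * (sb * sg - 1 + sg + sb)

# Arbitrary sample points with 1 <= gamma < beta.
samples = [(4.0, 1.0), (9.0, 4.0), (100.0, 1.5), (2.0, 1.0)]
for beta, gamma in samples:
    # Both forms should agree, and both should be negative.
    assert math.isclose(lhs(beta, gamma), factored(beta, gamma), abs_tol=1e-9)
    assert lhs(beta, gamma) < 0
```

The first factor is negative because $\gamma < \beta$, and the second is positive because $\gamma \ge 1$ forces $\sqrt{\beta\gamma} + \sqrt{\gamma} + \sqrt{\beta} > 1$.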
Lemma B.5 When $\alpha = -1 < 0 < -\alpha \le \gamma$, the inequality (2.23) holds for all $\beta \le \alpha = -1$ if and only if $\gamma \le 3 + 2\sqrt{2}$. If, however, $\gamma > 3 + 2\sqrt{2}$, then (2.23) holds for $\beta \le -\left(\frac{\gamma - \sqrt{\gamma}}{\sqrt{\gamma} + 1}\right)^{2}$.

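Proof: The inequality

\[
\frac{\gamma + 1}{\sqrt{\gamma}} \le \frac{-1 - \beta}{\sqrt{-\beta}} + \frac{\gamma - \beta}{\sqrt{-\beta\gamma}}
\]

is equivalent to

\[
(\gamma + 1)\sqrt{-\beta} - (-1 - \beta)\sqrt{\gamma} - (\gamma - \beta) \le 0.
\]

Write $-\beta = \zeta^2$, so the above inequality reads

\[
-\zeta^2(\sqrt{\gamma} + 1) + (\gamma + 1)\zeta + (\sqrt{\gamma} - \gamma) \le 0. \tag{B.1}
\]

For (2.23) to hold for all $\beta \le \alpha = -1$, the inequality (B.1) must be true for all $\zeta \ge 1$. Since the two zeros of $-\zeta^2(\sqrt{\gamma} + 1) + (\gamma + 1)\zeta + (\sqrt{\gamma} - \gamma)$ are $\zeta = 1$ and $\zeta = \frac{\gamma - \sqrt{\gamma}}{\sqrt{\gamma} + 1}$, and

\[
\frac{\gamma - \sqrt{\gamma}}{\sqrt{\gamma} + 1} \le 1
\]

gives $\gamma \le 3 + 2\sqrt{2}$, we know that (2.23) holds for all $\beta \le \alpha = -1$ if and only if $\gamma \le 3 + 2\sqrt{2}$. If, however, $\gamma > 3 + 2\sqrt{2}$, then (2.23) is violated for $-\left(\frac{\gamma - \sqrt{\gamma}}{\sqrt{\gamma} + 1}\right)^{2} < \beta < -1$. ■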
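
The threshold $3 + 2\sqrt{2}$ in Lemma B.5 can be illustrated numerically. The following Python sketch is not part of the original argument. It assumes $\mathrm{RelDist}(\alpha, \gamma) = |\alpha - \gamma| / \sqrt{|\alpha\gamma|}$ for nonzero real arguments, the form used in the proofs above; the names `rel_dist`, `quad`, `second_root` and `violates` are hypothetical helpers.

```python
import math

THRESHOLD = 3 + 2 * math.sqrt(2)

def rel_dist(a, c):
    # Assumed form of RelDist for nonzero real a and c.
    return abs(a - c) / math.sqrt(abs(a * c))

def quad(zeta, gamma):
    # Left-hand side of (B.1), with alpha = -1 and -beta = zeta**2.
    s = math.sqrt(gamma)
    return -zeta**2 * (s + 1) + (gamma + 1) * zeta + (s - gamma)

def second_root(gamma):
    # The zero of quad(., gamma) other than zeta = 1.
    s = math.sqrt(gamma)
    return (gamma - s) / (s + 1)

def violates(beta, gamma, alpha=-1.0):
    # True when the triangle inequality fails for (alpha, beta, gamma).
    return rel_dist(alpha, gamma) > rel_dist(alpha, beta) + rel_dist(beta, gamma)

# gamma = 9 exceeds the threshold, so the second root (1.5) lies above 1.
print(second_root(9.0), violates(-1.44, 9.0), violates(-4.0, 9.0))
# gamma = 4 is below the threshold, so no beta <= -1 gives a violation here.
print(second_root(4.0), violates(-1.44, 4.0))
```

With $\gamma = 9$, the choice $\beta = -1.44$ (that is, $\zeta = 1.2$) lies strictly between the two zeros in $\zeta$, which is where (B.1) fails; $\beta = -4$ ($\zeta = 2$) lies beyond the second zero, where it holds again.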

We may summarize how (2.23) fares in the 5 distinguished cases.

1. (2.23) survives in Case 1 by Lemma B.2;

2. (2.23) survives in Case 2 by Lemma B.3;

3. (2.23) survives in Case 3 by Lemma B.4;

4. (2.23) dies in Cases 4 and/or 5, unless $\gamma \le 3 + 2\sqrt{2}$, by Lemma B.5.